Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
75 commits
Select commit Hold shift + click to select a range
62f68b7
Make funcs private
Jul 3, 2025
95e33dc
Make consts private
Jul 3, 2025
6ce88a5
Make methods private
Jul 3, 2025
2f2f885
Restore doc
Jul 3, 2025
9def8eb
Rename package
Jul 3, 2025
1ae7a48
Add checker draft
Jul 4, 2025
96537b4
Add checkers cmp
Jul 4, 2025
fa46937
Fix
Jul 4, 2025
953178b
Use new string checker. Fix cmp func
Jul 4, 2025
e34dcf0
Fix linter errors and doc
Jul 7, 2025
e1a93a5
Add checker test
Jul 7, 2025
75f834a
Add new antispam ctor draft
Jul 7, 2025
f2081ad
Extract values
Jul 7, 2025
1e09f0e
Rename
Jul 8, 2025
89811b4
Change method signature
Jul 8, 2025
dd2eaeb
Add ctors
Jul 8, 2025
f3987a9
Refactor
Jul 8, 2025
c7b8d32
Use new antispam rules
Jul 8, 2025
be731a2
Fix linter errors
Jul 8, 2025
e586edf
Edit doc
Jul 8, 2025
979a92f
Fix doc
Jul 8, 2025
94479e0
Make funcs private
Jul 11, 2025
af8ce33
Return result instantly if blocked
Jul 11, 2025
24865b7
Rename 'limit' => 'threshold'
Jul 14, 2025
5baea95
Add test utils. Change meta tag parsing
Jul 14, 2025
90210f4
Refactor
Jul 14, 2025
d9b80ea
Separate logic op. Refactor
Jul 14, 2025
bdf6942
Refactor
Jul 14, 2025
7ba838e
Refactor
Jul 14, 2025
a85e7ba
Edit doc meta file
Jul 14, 2025
4aa922a
Extract value node
Jul 14, 2025
932b8b9
Impl logical node ctor
Jul 14, 2025
6c95ba9
Rename files
Jul 14, 2025
49d08cb
Add rules ctor
Jul 15, 2025
2e9156e
Remove antispam v2
Jul 15, 2025
351dbe6
Impl IsSpam with rules. Cast exceptions to rules
Jul 16, 2025
840f16a
Add 'enabled' flag
Jul 16, 2025
80d891e
Change meta key data tag separator
Jul 16, 2025
e0021eb
Dedublicate AnyToInt
Jul 16, 2025
64c5180
Extract field name. Use rule ctor
Jul 16, 2025
ba5937b
Deduplicate ctor utils
Jul 16, 2025
092a7be
Deduplicate logical node extraction
Jul 16, 2025
df1277d
Refactor
Jul 23, 2025
dce0cca
Remove label
Jul 23, 2025
237105f
Fix
Jul 24, 2025
c86c703
Restore ctor utils
Jul 24, 2025
6197e13
Refactor
Jul 24, 2025
6b424d5
Refactor
Jul 24, 2025
06094fd
Refactor
Jul 24, 2025
c53756e
Refactor
Jul 24, 2025
db7028e
Refactor
Jul 24, 2025
1380060
Edit check func
Jul 25, 2025
bdf843f
Move package logic
Jul 25, 2025
ce658c4
Add doc
Jul 25, 2025
630e036
Refactor
Jul 25, 2025
ae814f4
Parse data type tag in do_if package
Jul 25, 2025
faa4eb8
Add 'unite_sources' flag
Jul 28, 2025
9638af3
Refactor
Jul 28, 2025
7efa1d3
Rework tests
Jul 28, 2025
209fb91
Add tests
Jul 28, 2025
faba299
Add tests
Jul 29, 2025
7b22e78
Rename package
Jul 29, 2025
c2773b5
Fix tests
Jul 29, 2025
924efb7
Add field op data type parsing test
Jul 29, 2025
cd7b41c
Rename fieldOpNode => stringOpNode
Jul 30, 2025
fdb8f36
Fix
Jul 30, 2025
0bca733
Add draft test
Jul 31, 2025
2dae4f5
Restore old test
Jul 31, 2025
66c521a
Edit convertor test
Aug 4, 2025
ff57d51
Add test cases
Aug 4, 2025
6ed13d8
Add tests
Aug 5, 2025
f6624a7
Restore old test again
Aug 8, 2025
c617d85
Rename tests
Aug 8, 2025
5cf374a
Refactor
Aug 8, 2025
ab9dc6a
Refactor
Aug 11, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 1 addition & 5 deletions Insanedocfile
Original file line number Diff line number Diff line change
Expand Up @@ -3,21 +3,17 @@ extractors:
fn-list: '"fn-list" #4 /Plugin\)\s(.+)\s{/'
match-modes: '"match-modes" /MatchMode(.*),/ /\"(.*)\"/'
do-if-node: '"do-if-node" /Node(\w+)\s/'
do-if-field-op: '"do-if-field-op" /field(\w+)OpTag\s/'
do-if-logical-op: '"do-if-logical-op" /logical(\w+)Tag\s/'
decorators:
config-params: '_ _ /*`%s`* / /*`default=%s`* / /*`%s`* / /*`options=%s`* /'
fn-list: '_ _ /`%s`/'
match-modes: '_ /%s/ /`match_mode: %s`/'
do-if-node: '_ /%s/'
do-if-field-op: '_ /%s/'
do-if-logical-op: '_ /%s/'
templates:
- template: docs/*.idoc.md
files: ["../pipeline/*.go"]
- template: pipeline/*.idoc.md
files: ["*.go"]
- template: pipeline/doif/*.idoc.md
- template: pipeline/do_if/*.idoc.md
files: ["*.go"]
- template: plugin/*/*/README.idoc.md
files: ["*.go"]
Expand Down
17 changes: 17 additions & 0 deletions cfg/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -731,3 +731,20 @@ func mergeYAMLs(a, b map[interface{}]interface{}) map[interface{}]interface{} {
}
return merged
}

func AnyToInt(v any) (int, error) {
switch vNum := v.(type) {
case int:
return vNum, nil
case float64:
return int(vNum), nil
case json.Number:
vInt64, err := vNum.Int64()
if err != nil {
return 0, err
}
return int(vInt64), nil
default:
return 0, fmt.Errorf("not convertable to int: value=%v type=%T", v, v)
}
}
32 changes: 32 additions & 0 deletions cfg/matchrule/matchrule.go
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,19 @@ func (m *Mode) UnmarshalJSON(i []byte) error {
return nil
}

func (m *Mode) ToString() string {
switch *m {
case ModeContains:
return "contains"
case ModePrefix:
return "prefix"
case ModeSuffix:
return "suffix"
default:
panic("unreachable")
}
}

const (
ModePrefix Mode = iota
ModeContains
Expand Down Expand Up @@ -66,6 +79,14 @@ type Rule struct {
prepared bool
}

func (r *Rule) GetMinValueSize() int {
return r.minValueSize
}

func (r *Rule) GetMaxValueSize() int {
return r.maxValueSize
}

func (r *Rule) Prepare() {
if len(r.Values) == 0 {
return
Expand Down Expand Up @@ -186,6 +207,17 @@ var (
condOrBytes = []byte(`"or"`)
)

func (c *Cond) ToString() string {
switch *c {
case CondAnd:
return "and"
case CondOr:
return "or"
default:
panic("unreachable")
}
}

type RuleSet struct {
// > @3@4@5@6
// >
Expand Down
22 changes: 0 additions & 22 deletions decoder/common.go
Original file line number Diff line number Diff line change
@@ -1,27 +1,5 @@
package decoder

import (
"encoding/json"
"errors"
)

func anyToInt(v any) (int, error) {
switch vNum := v.(type) {
case int:
return vNum, nil
case float64:
return int(vNum), nil
case json.Number:
vInt64, err := vNum.Int64()
if err != nil {
return 0, err
}
return int(vInt64), nil
default:
return 0, errors.New("value is not convertable to int")
}
}

// atoi is allocation free ASCII number to integer conversion
func atoi(b []byte) (int, bool) {
if len(b) == 0 {
Expand Down
3 changes: 2 additions & 1 deletion decoder/json.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import (
"slices"
"sync"

"github.com/ozontech/file.d/cfg"
insaneJSON "github.com/ozontech/insane-json"
"github.com/tidwall/gjson"
)
Expand Down Expand Up @@ -140,7 +141,7 @@ func extractJsonParams(params map[string]any) (jsonParams, error) {
return jsonParams{}, fmt.Errorf("%q must be map", jsonMaxFieldsSizeParam)
}
for k, v := range maxFieldsSizeMap {
vInt, err := anyToInt(v)
vInt, err := cfg.AnyToInt(v)
if err != nil {
return jsonParams{}, fmt.Errorf("each value in %q must be int", jsonMaxFieldsSizeParam)
}
Expand Down
10 changes: 7 additions & 3 deletions fd/util.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,13 +11,14 @@ import (
"github.com/ozontech/file.d/logger"
"github.com/ozontech/file.d/pipeline"
"github.com/ozontech/file.d/pipeline/antispam"
"github.com/ozontech/file.d/pipeline/doif"
"github.com/ozontech/file.d/pipeline/do_if"
)

func extractPipelineParams(settings *simplejson.Json) *pipeline.Settings {
capacity := pipeline.DefaultCapacity
antispamThreshold := pipeline.DefaultAntispamThreshold
var antispamExceptions antispam.Exceptions
var antispamCfg map[string]any
sourceNameMetaField := pipeline.DefaultSourceNameMetaField
avgInputEventSize := pipeline.DefaultAvgInputEventSize
maxInputEventSize := pipeline.DefaultMaxInputEventSize
Expand Down Expand Up @@ -101,6 +102,8 @@ func extractPipelineParams(settings *simplejson.Json) *pipeline.Settings {
}
antispamExceptions.Prepare()

antispamCfg = settings.Get("antispam").MustMap()

sourceNameMetaField = settings.Get("source_name_meta_field").MustString()
isStrict = settings.Get("is_strict").MustBool()

Expand Down Expand Up @@ -129,6 +132,7 @@ func extractPipelineParams(settings *simplejson.Json) *pipeline.Settings {
CutOffEventByLimitField: cutOffEventByLimitField,
AntispamThreshold: antispamThreshold,
AntispamExceptions: antispamExceptions,
Antispam: antispamCfg,
SourceNameMetaField: sourceNameMetaField,
MaintenanceInterval: maintenanceInterval,
EventTimeout: eventTimeout,
Expand Down Expand Up @@ -219,13 +223,13 @@ func extractMetrics(actionJSON *simplejson.Json) (string, []string, bool) {
return metricName, metricLabels, skipStatus
}

func extractDoIfChecker(actionJSON *simplejson.Json) (*doif.Checker, error) {
func extractDoIfChecker(actionJSON *simplejson.Json) (*do_if.Checker, error) {
m := actionJSON.MustMap()
if m == nil {
return nil, nil
}

return doif.NewFromMap(m)
return do_if.NewFromMap(m)
}

func makeActionJSON(actionJSON *simplejson.Json) []byte {
Expand Down
72 changes: 71 additions & 1 deletion pipeline/antispam/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,77 @@ In some systems services might explode with logs due to different circumstances.

The main entity is `Antispammer`. It counts input data from the sources (e.g. if data comes from [file input plugin](/plugin/input/file/README.md), source can be filename) and decides whether to ban it or not. For each source it counts how many logs it has got, in other words the counter for the source is incremented for each incoming log. When the counter is greater or equal to the threshold value the source is banned until its counter is less than the threshold value. The counter value is decremented once in maintenance interval by the threshold value. The maintenance interval for antispam is the same as for the pipeline (see `maintenance_interval` in [pipeline settings](/pipeline/README.md#settings)).

## Exceptions
## Antispam config

Example:

```
antispam:
threshold: 3000
rules:
- name: alert_agent
if:
op: and
operands:
- op: contains
data: meta.service
values:
- alerts-agent
- op: prefix
data: event
values:
- '{"level":"debug"'
threshold: -1
- name: viewer
if:
op: and
operands:
- op: contains
data: source_name
values:
- viewer
threshold: 5000
```

Antispammer iterates over rules, checks event and applies first matched rule.
If event does not match any rule it will be limited with common threshold.

### Antispam fields

**`threshold`** **`int`**

Common threshold applied to events that don't match any rule.
Values:
- `-1` - no limit;
- `0` - discard all logs;
- `> 0` - normal threshold value.

**`rules`**

Antispam rules array

### Rule fields

**`name`** **`string`**

Name of rule. If set to nonempty string, adds label value for the `name` label in the `antispam_exceptions` metric.

**`threshold`** **`int`**

Rule threshold. Has the same value meanings as common threshold.

**`if`**

`do_if`-like condition tree (see [doc](../do_if/README.md)).
Difference is we allowed only logical and data operations.
We use `data` to point data to check instead of `field`.
Values:
- `event`
- `source_name`
- `meta.name` - get data to check from metadata by key `name`


## Exceptions [deprecated: use rules instead]

Antispammer has some exception rules which can be applied by checking source name or log as raw bytes contents. If the log is matched by the rules it is not accounted for in the antispammer. It might be helpful for the logs from critical infrastructure services which must not be banned at all.

Expand Down
Loading
Loading