Query Optimizer

Download 33
Last updated Nov 9, 2025

User Guide

This documentation explains how the Query Optimizer app optimizes Logpresso queries, and under what circumstances.

1. Search Pushdown Optimization

SearchPushdownPlanner is an optimization that moves search conditions as far forward (closer to the data source) as possible, reducing unnecessary data processing.

1.1 Concept Summary

  • Pulls search conditions from the back to the front so that

    • Filtering occurs at the table or fulltext stage, reducing subsequent computation
  • When possible

    • Merges search conditions into table/fulltext index queries for improved speed
  • Supported scope

    • Comparison operators: ==, !=, >, >=, <, <=
    • Null checks: isnull, isnotnull
    • Logical operators: and, or
    • Some type conversions: date, ip, time field (_time) conditions

1.2 Automatic Time Function Conversion (ago, now to date)

Commonly used functions like ago and now are internally converted to date functions first. This process is performed automatically and does not change the query semantics.

Why convert?

  • Search pushdown and index merging operate on simple constant-form time values.
  • Since ago and now are functions, they are difficult to merge directly. Converting them to date form with actual time strings enables index optimization.

Conversion Examples

Before:

table from=20251108 to=20251109 web_logs
| search _time >= ago("1h")

After (e.g., if current time is 2025-11-08 23:00:00):

table from=20251108 to=20251109 web_logs
| search _time >= date("2025-11-08 22:00:00", "yyyy-MM-dd HH:mm:ss")

Before:

table from=20251108 web_logs
| search _time < now()

After (e.g., if current time is 2025-11-08 23:00:00):

table from=20251108 web_logs
| search _time < date("2025-11-08 23:00:00", "yyyy-MM-dd HH:mm:ss")

Usage Notes

  • Only string constant arguments like ago("1h"), ago("30m") are supported.
  • now is always subject to conversion.
  • In special cases where internal time conversion fails, the original query is safely preserved.

1.3 Search Pushdown Execution Flow

  1. Traverses the query from left to right to find search commands.

  2. Checks whether each search expression is in a pushdown-eligible form.

  3. From that position, walks backward through preceding commands.

  4. Depending on the command type:

    • Pass-through: moves search further forward
    • Blocked: no further movement possible at that point
    • Special handling: rename, dataset, join, etc.
  5. Moves the search to the farthest possible pushdown point, or pushes it into the data source (table, fulltext, dataset, join) for optimization.

When There Are Multiple search Commands

Example:

cmd1
| search A
| search B
| search C
  • Processes in left-to-right order: A, then B, then C.
  • Each is independently moved as far forward as possible, but the original order (A, B, C) is preserved.

1.4 Pushdown-Eligible search Conditions

The following simple filter conditions are eligible for pushdown:

  • Comparison operators

    • ==, !=, >, >=, <, <=
  • Null checks

    • search isnotnull(user_id)
    • search isnull(error_code)
  • Logical operators

    • A and B, A or B
  • Type conversions

    • date, ip functions
  • Specially handled != operations

!= Operation Handling Rules

  • Only numeric, IP, and date types are eligible for index merge optimization. e.g., search status_code != 500, search src_ip != ip("192.168.0.1")

When converting to index form, the following interpretation is applied to preserve semantics:

search status_code != 500
-> (status_code != null) and (status_code != 500)

A simple != 500 comparison does not originally imply excluding null values, so a != null condition is appended to ensure the semantics remain unchanged.

String != Comparisons Are Not Optimized

Example:

search method != "POST"
  • Since indexes do not distinguish between uppercase and lowercase for string comparisons, the meaning could change.
  • In this case, pushdown/merge optimization is not applied, and the original is preserved as-is.

1.5 Relationship Between Key Commands and search

1) Relationship with order

Before:

table web_logs
| order by _time
| search status_code == 200

Since order only handles sorting, search can be moved forward.

After:

table web_logs
| search status_code == 200
| order by _time

2) Relationship with fields

  • If the fields selected by fields include the field used in search, pass-through and movement occurs
  • If not included, no further movement is possible at that point

Pass-through case:

table web_logs
| fields status_code, method
| search status_code == 200

->

table web_logs
| search status_code == 200
| fields status_code, method

Blocked case:

table web_logs
| fields method
| search status_code == 200

-> No change

3) Relationship with rename

Before:

table web_logs
| rename status_code as code
| search code == 200
  • The field name code used in search is reverted to the pre-rename name status_code before pushdown.

After:

table web_logs
| search status_code == 200
| rename status_code as code

4) Pushdown into dataset Subqueries

Before:

dataset [
  table web_logs
  | fields status_code
]
| search status_code == 200

After:

dataset [
  table web_logs
  | fields status_code
  | search status_code == 200
]
| # pushdown: search status_code == 200

Even if the user writes dataset ... | search ..., internally the filtering occurs at the data retrieval stage, providing a performance benefit.

5) Relationship with join

Supported scope and constraints:

  • Only single-key joins are supported (composite-key joins are not eligible for pushdown)
  • Only == comparisons on the join key are pushed down
  • Cross joins are not eligible for pushdown

Before:

table web_logs
| join type=inner user_id [ table users ]
| search user_id == 1000

After (conceptual):

table web_logs
| search user_id == 1000
| join type=inner user_id [
    table users
    | search user_id == 1000
  ]

Search on fields other than the join key are not moved by default.


1.6 Search to Index Merge (Merge Phase)

After pushdown, search conditions are merged into index queries with table or fulltext commands.

1) Numeric (Integer, IP, Date) Comparisons

The following conditions are directly converted to indexes, and the original search is removed:

  • Integer comparison: status_code == 200, bytes > 1024
  • IP comparison: src_ip == ip("192.168.1.1")
  • Date comparison: _time > date("2025-01-01", "yyyy-MM-dd")

Before:

table from=20250101 to=20250102 web_logs
| search status_code == 200

After:

fulltext from=20250101000000 to=20250102000000 tt=t (status_code == 200) from web_logs

Floating-point (float, double) comparisons are not yet eligible for index merge optimization. e.g., response_time > 1.5 remains as a search.

2) String == Comparisons

Example:

table ... | search method == "POST"
  • Both index merge and the original search are performed simultaneously.
  • Since index search is case-insensitive, the original search provides an additional filtering pass to ensure exact case matching.

Conceptually:

fulltext ... ((method == "POST") and (other conditions)) from web_logs
| search method == "POST"

3) String != Comparisons

String != comparisons such as search method != "POST" are excluded from optimization for the reasons described above, and the original query is not modified.


1.7 _time Conditions and Period (from/to) Merging

_time filters are merged directly into the from and to options of table or fulltext commands when possible.

Supported Time Functions

  • date(value, format)
  • String constant arguments like ago("1h")
  • now()

Merge Rules (Conceptual)

  • The overall time range is [from, to) (from inclusive, to exclusive)
  • Uses second-level precision, and milliseconds are appropriately floored or ceiled during range calculation.

Example:

Before:

table from=20251108 to=20251109 web_logs
| search _time >= date("2025-11-08 10:00:00", "yyyy-MM-dd HH:mm:ss")
  and _time < date("2025-11-08 18:00:00", "yyyy-MM-dd HH:mm:ss")

After:

table from=20251108100000 to=20251108180000 web_logs

When only _time conditions are present, table is not converted to fulltext; only from and to are adjusted to reduce unnecessary data scanning.

Empty Time Range

If the final from value becomes equal to or greater than the to value due to _time conditions, no results can exist, so the entire query is converted to result 0.

Example:

Before:

table from=20251108 to=20251109 web_logs
| search _time < date("2025-11-07", "yyyy-MM-dd")

After:

result 0

Millisecond Handling

Time comparisons involving milliseconds only merge the covering range, and the original search is preserved.

Example:

table from=20251108 to=20251109 web_logs
| search _time >= date("2025-11-08 13:00:00.500", "yyyy-MM-dd HH:mm:ss.SSS")
  • from is floored to 13:00:00
  • The search condition is preserved as-is to ensure precise millisecond filtering.

1.8 When Search Pushdown Is Halted

Search is no longer moved forward when the following commands are encountered:

  • Driver commands that define a new data source e.g., table, fulltext, dataset, json
  • Transformation commands that do not yet support pushdown e.g., eval, rex, parse, lookup, some join/union cases, etc.

In these cases, the query functionality itself is preserved as-is; only the performance optimization is foregone.


2. Stats Fields Pushdown Optimization

StatsFieldsPushdownPlanner automatically inserts fields before aggregation commands like stats, pivot, rollup, and cube, keeping only the fields that are actually needed for aggregation.

2.1 Operation Overview

  1. Finds stats, pivot, rollup, and cube commands in the query.

  2. Calculates the fields required for that aggregation.

    • Group by targets (by clause, rows/cols)
    • Aggregation function argument fields (e.g., sum(bytes) -> bytes)
  3. Inserts fields immediately before the aggregation command.

  4. Walks backward through preceding commands

    • Adjusts and cleans up existing fields, rename, eval, etc.
    • Removes unnecessary fields

As a result, unnecessary fields are removed before the aggregation stage, reducing memory usage and CPU overhead.

2.2 Example

Original:

log schema="web_logs"
| rename status_code as code
| search method == "GET"
| eval kb = bytes / 1024
| stats sum(kb) by code

Result:

  • Fields needed for aggregation

    • group by: code -> original field status_code
    • Aggregation argument: kb -> original field bytes
    • Field needed for search: method
log schema="web_logs"
| fields method, bytes, status_code
| search method == "GET"
| rename status_code as code
| eval kb = bytes / 1024
| stats sum(kb) by code

3. Fulltext Condition Deduplication

FulltextDeduplicationPlanner removes completely identical conditions that are duplicated within a fulltext query.

Before:

fulltext ((type == "TRAFFIC") and (dst_port == 5432) and (type == "TRAFFIC")) from logs

After:

fulltext ((type == "TRAFFIC") and (dst_port == 5432)) from logs

This simplifies the query without changing its semantics, reducing internal evaluation cost. No user action is required.


4. Redundant order Removal

RedundantOrderRemovalPlanner removes unnecessary intermediate order commands that do not affect the final field sorting state.

Before:

table web_logs
| order _time
| search status_code == 200
| fields status_code, method

After:

table web_logs
| search status_code == 200
| fields status_code, method

When commands that determine the final field order (such as fields, rename, or a trailing order) exist later in the pipeline, the preceding order is deemed to have no meaningful impact on the result and is removed.


5. Diagnosing Optimization with explain

When a query does not seem to be transformed as intended, you can check the optimization process using the explain command.

Example:

explain [
  log schema="session" alias=t
  | search start >= ago("10m")
  | stats count by src_ip
  | search is_changed
]

Running this displays a table with the following columns:

  • step: optimization order
  • planner: the name of the applied optimization planner (e.g., log-command-rewriter, search-pushdown-optimizer, fulltext-deduplication-optimizer, stats-fields-pushdown-optimizer, etc.)
  • is_changed: whether the query was actually changed compared to the previous step
  • query: the query string at that step

This allows you to verify step by step:

  • When and which planner was applied
  • Whether search pushdown, _time merging, stats fields reduction, etc. were actually applied

6. When the Query Is Not Transformed as Expected

In most cases, optimization only changes performance, and query results (row count, values) should remain identical. If the query results differ after following the procedure below, it is likely a bug.

  1. Run the query with the optimization app enabled.

  2. Run the same query again with the optimization app disabled.

  3. If comparing the two results shows

    • Different row counts, or
    • Different statistical values, or
    • Specific records disappearing or appearing, etc. the result data itself is different.

In such cases, please submit an inquiry to the Logpresso support portal with the following information:

  1. The complete original query that caused the issue

  2. Materials that allow comparison of the same query

    • Executed with the app enabled
    • Executed with the app disabled (result screenshots, downloaded result files, etc.)
  3. The result of wrapping the query (executed with the app enabled) in explain

    e.g.,

    explain [
      log schema="session" alias=t
      | search start >= ago("10m")
      | stats count by src_ip
    ]
    
    • Screenshot of the explain output
    • If possible, the result text downloaded via the download button in the upper right corner
  4. A brief description of the difference between expected and actual behavior

By submitting cases where query results differ between app enabled and disabled states to the support portal, the development team will analyze the cause based on the explain results and provide explanations of constraints or improvements through patches as needed.


7. Query Writing Tips Summary

  • Using search with simple comparisons maximizes optimization effectiveness.
  • It is recommended to specify time filters based on _time and use them together with from and to.
  • Complex string functions are not currently eligible for pushdown, so combining them with preprocessing search when possible is recommended.
  • If optimization does not occur as expected, check the explain results first, and if needed, submit them to the support portal for analysis and patching.