Blog

“ Mach Speed Horizontally Scalable Time series database. ”

Machbase 6.1 vs InfluxDB 2.7

by Grey Shim / 18 Oct 2023

Introduction

In the last article, we compared InfluxDB 1.8.1 with Machbase. Although 1.8 is the version the Linux package manager installs by default, we wanted to see how the latest open-source version, 2.7, compares to Machbase 6.1.

Machbase

Machbase is a DBMS designed for fast input, search and statistics of time series sensor data.

It runs on edge devices such as Raspberry Pi, regular single servers, and multi-server clusters. It has special features and an architecture designed for processing time series sensor data and machine log data.

InfluxDB

It is an open-source time series DBMS developed by InfluxData. It is one of the most popular products for processing time series data. InfluxDB also supports clustering.

For more information about InfluxDB, see the site below. https://docs.influxdata.com/influxdb

Test
Test environment

For this test, we used the following environment.

  • CPU : AMD EPYC 7742 64-Core Processor(128 thread)
  • Memory : 256GB
  • Disk : Samsung NVME 2TB(PCI Express 3.0 x 4)
  • OS : CentOS Linux release 7.8.2003
  • Database : Machbase 6.1.8 Fog / InfluxDB v2.7.0
Test constraints

InfluxDB 2.x removes the database concept used in version 1.8, adds buckets and organizations, reduces SQL support, and strengthens security with token-based authentication. Together, these changes make existing client applications incompatible. As a result, the existing tests written in SQL could not be reused, so we rewrote the queries in the Flux language and ran those instead.
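To make the incompatibility concrete, here is a minimal sketch of how a 2.x query request is shaped: the Flux script travels in the request body, the organization in the query string, and the token in the Authorization header. The host, organization name, and token value are placeholders invented for illustration, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_flux_request(base_url, org, token, flux):
    """Build (but do not send) an InfluxDB 2.x /api/v2/query request."""
    query_string = urllib.parse.urlencode({"org": org})
    url = base_url.rstrip("/") + "/api/v2/query?" + query_string
    return urllib.request.Request(
        url,
        data=flux.encode("utf-8"),  # the Flux script is the request body
        method="POST",
        headers={
            "Authorization": "Token " + token,  # token-based authentication
            "Content-Type": "application/vnd.flux",
            "Accept": "application/csv",
        },
    )


# Placeholder host, organization, and token (assumptions, not from the test).
req = build_flux_request(
    "http://localhost:8086",
    "example-org",
    "EXAMPLE_TOKEN",
    'from(bucket:"sensor_data/autogen") |> range(start: -1h) |> count()',
)
print(req.get_method(), req.full_url)
```

A 1.8 client that sends SQL-like queries against a named database has none of these three pieces, which is why the test queries had to be rewritten rather than pointed at the new server.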

For Machbase, we reused the existing test results.

Test input performance

With the update of InfluxDB from 1.8 to 2.7, resource usage has increased significantly.

As before, we ran the tests in the following environments.

Ingestion performance test results

Despite the increased resource headroom on our test machine, InfluxDB’s input performance is slightly worse than before (160,000 eps). In other words, InfluxDB’s input performance has not improved with the new version.
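For readers who want to reproduce an ingestion figure like this, the sketch below shows one way to compute it, assuming eps means events (records) written per second. The `measure_eps` helper and the in-memory stand-in writer are hypothetical; a real run would pass a function that sends each batch to the database under test.

```python
import time


def measure_eps(write_batch, batches):
    """Return records per second for calling write_batch on every batch."""
    total = 0
    start = time.perf_counter()
    for batch in batches:
        write_batch(batch)
        total += len(batch)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return total / elapsed if elapsed > 0 else float("inf")


# Stand-in writer: appends batches to a list instead of calling a database.
sink = []
batches = [[("EQ0^TAG1", i, float(i)) for i in range(1000)] for _ in range(10)]
eps = measure_eps(sink.append, batches)
print(f"{eps:,.0f} events/sec")
```

Timing the whole loop rather than individual writes keeps per-call timer overhead out of the result.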

Performance testing of queries

InfluxDB is reducing support for SQL queries and increasing support for Flux, its own query language. As a result, the existing test query tool can no longer be used, so we translated each of the following SQL queries into a Flux query for testing. The queries are listed below.

  • Machbase(SQL) / Q1
  • select count(*) from tag;
  • InfluxDB(Flux) / Q1
  • from(bucket:"sensor_data/autogen")
    |> range(start:2018-01-01T00:00:00+09:00, stop:2018-01-02T01:00:00+09:00)
    |> filter(fn: (r) => r._measurement == "tag_data")
    |> group()
    |> count() |> yield(name: "count")
  • Machbase(SQL) / Q2
  • select count(*) from (select * from tag where name = 'EQ0^TAG1' and
    time between to_date('2018-01-01 00:00:00') and to_date('2018-01-02 00:00:00'));
  • InfluxDB(Flux) / Q2
  • from(bucket:"sensor_data/autogen")
    |> range(start:2018-01-01T00:00:00+09:00, stop:2018-01-02T01:00:00+09:00)
    |> filter(fn: (r) => r._measurement == "tag_data" and r.name == "EQ0^TAG1")
    |> group()
    |> count() |> yield(name: "count")
  • Machbase(SQL) / Q3
  • select count(*) from (select * from tag where name in ('EQ0^TAG1', 'EQ0^TAG2', 'EQ0^TAG3',
    'EQ0^TAG4', 'EQ0^TAG5', 'EQ0^TAG6', 'EQ0^TAG7', 'EQ0^TAG8', 'EQ0^TAG9', 'EQ0^TAG10') and
    time between to_date('2018-01-01 00:00:00') and to_date('2018-01-02 00:00:00'));
  • InfluxDB(Flux) / Q3
  • from(bucket:"sensor_data/autogen")
    |> range(start:2018-01-01T00:00:00+09:00, stop:2018-01-02T01:00:00+09:00)
    |> filter(fn: (r) => r._measurement == "tag_data" and (r.name == "EQ0^TAG1" or r.name == "EQ0^TAG2" or r.name == "EQ0^TAG3" or r.name == "EQ0^TAG4" or r.name == "EQ0^TAG5" or r.name == "EQ0^TAG6" or r.name == "EQ0^TAG7" or r.name == "EQ0^TAG8" or r.name == "EQ0^TAG9" or r.name == "EQ0^TAG10"))
    |> group()
    |> count() |> yield(name: "count")
  • Machbase(SQL) / Q4
  • select name, stddev(value) from tag where name = 'EQ0^TAG1'
    and time >= to_date('2018-01-01 00:00:00') and time < to_date('2018-01-02 00:00:00')
    group by name;
  • InfluxDB(Flux) / Q4
  • from(bucket:"sensor_data/autogen")
    |> range(start:2018-01-01T00:00:00+09:00, stop:2018-01-02T01:00:00+09:00)
    |> filter(fn: (r) => r._measurement == "tag_data" and r.name == "EQ0^TAG1" and r._field == "value")
    |> stddev()
    |> group(columns: ["_measurement"], mode: "by")
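Writing the multi-tag filter by hand is error-prone, so here is a minimal sketch of generating the count queries from a list of tag names. The `build_count_query` helper is hypothetical, not part of any InfluxDB client. It wraps the chain of name comparisons in parentheses because Flux gives `and` higher precedence than `or`; without them, only the first name would be combined with the measurement check.

```python
def build_count_query(bucket, start, stop, measurement, names):
    """Return a Flux script that counts rows for the given tag names."""
    predicate = f'r._measurement == "{measurement}"'
    if names:
        # Parenthesize the or-chain so it is evaluated before the `and`.
        name_clause = " or ".join(f'r.name == "{n}"' for n in names)
        predicate += f" and ({name_clause})"
    return "\n".join([
        f'from(bucket:"{bucket}")',
        f"    |> range(start:{start}, stop:{stop})",
        f"    |> filter(fn: (r) => {predicate})",
        "    |> group()",
        '    |> count() |> yield(name: "count")',
    ])


names = [f"EQ0^TAG{i}" for i in range(1, 11)]
query = build_count_query(
    "sensor_data/autogen",
    "2018-01-01T00:00:00+09:00",
    "2018-01-02T01:00:00+09:00",
    "tag_data",
    names,
)
print(query)
```

Passing a single name gives a Q2-style query and passing an empty list gives a Q1-style query, so one helper covers the three count tests.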
Test results for query performance

Query execution time was measured with the machsql and influx command-line tools, respectively. The results are shown below.
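The measurement itself can be approximated with a small wrapper that times a command-line invocation from start to exit. This is a sketch under assumptions, not the harness used for the test: `time_command` is a hypothetical helper, and a harmless Python command stands in for the real client call so the example runs anywhere.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv `repeats` times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard output so terminal rendering does not inflate the timing.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in command; replace with the real machsql or influx invocation.
timings = time_command([sys.executable, "-c", "pass"])
print(f"best of {len(timings)}: {min(timings):.3f}s")
```

Note that wall-clock timing of a CLI includes process startup and connection setup, so it slightly overstates pure query time for both products.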

Comparison of test results in aggregate

The performance gap on the first query, which counts the number of records, is so large that the results for the other three queries are shown separately below.

Conclusion

As in the previous Machbase vs. InfluxDB performance comparison, InfluxDB remains very slow on input and Machbase is faster on queries. InfluxDB has also reduced its support for SQL in favor of its own query language, Flux, which makes it less compatible with existing applications.

Machbase promises to maintain compatibility with existing programs by adhering to industry standards, including SQL, to improve support for various new APIs in Machbase neo, and to keep improving performance.

Thank you.

Machbase CRO, Grey Shim
