Leveraging the Elastic Common Schema

Introduction

The Elastic Common Schema (ECS) is a normalized format recently proposed by the Elastic community. Although still in beta, it is already usable, concrete and, more importantly, promising. The idea behind ECS is simple: rely on a common specification to structure the data indexed in Elasticsearch. Such data normalization makes it simple and efficient to process data coming from various sources (firewalls, systems, servers, applications, sensors, and so on).
What are the key strengths of ECS?

  • An identical naming convention for all Elasticsearch-based projects
  • The entire Elastic suite is adopting this nomenclature (Beats, Logstash, APM, …)
  • The proposed field naming is both simple and clear
  • It makes it easier to design common Kibana dashboards acting on different sources of data
  • It simplifies the user experience when navigating from one domain to another: cybersecurity, network, application, cloud resources, etc.
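To illustrate the naming convention, here is a small, hypothetical ECS event (the field values are made up for the example; they are not taken from a real parser). ECS dotted field names simply map to nested JSON objects in Elasticsearch, so the same document can be viewed either way:

```python
import json

# A minimal, hypothetical event using ECS field names.
# Related fields are grouped under common prefixes: ecs.*, event.*, source.*, http.*
ecs_event = {
    "ecs": {"version": "1.0.0-beta2"},
    "event": {"action": "Not Found", "type": "web"},
    "source": {"ip": "189.134.68.95", "user": {"name": "alice"}},
    "http": {"response": {"status_code": 404}},
}

def flatten(doc, prefix=""):
    """Return the dotted-field view of a nested ECS document."""
    out = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, name + "."))
        else:
            out[name] = value
    return out

print(json.dumps(flatten(ecs_event), indent=2))
```

Flattening yields the dotted names used throughout this post, such as `source.user.name` and `http.response.status_code`.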
To learn more, check out Elastic's excellent webinar on ECS. The source code is directly available on GitHub.



ECS and the Punch

We did not wait for the Elastic team to come up with a standard taxonomy to tackle this kind of problem. Several nomenclatures for naming log data fields already exist. We based ours on Open XDAS, a standard model for reasoning about events related to network components and security data analysis. Because the Punch has been used in production for several years now, we have enriched and adapted this format to our additional functional needs.


What, then, is the interest for the Punch in moving from Open XDAS to ECS? Simply put, it is strategic. The Punch is above all an Elasticsearch-centric solution, and ECS precisely aims at unifying the various data source formats so that it becomes simple and efficient to process that data as one well-normalized dataset. In turn, such a common schema will allow the emergence of a marketplace providing Elasticsearch users with processors, parsers and dashboards.


In addition, some of the Beats already rely on that schema, and the upcoming version 7 of the Elastic Stack will use it for the majority of its components. In short: we have been waiting for such a standard for a long time.



Punch ECS Integration

The Punch is particularly modular by design. To let our users easily switch to ECS, we simply provide a new punchlet. A punchlet is a micro-processing module that users can deploy anywhere in their pipelines. This new punchlet is named "ecs-convertor.punch" and is now available on the Thales inner source Punch marketplace, and of course to our external customers.


This is yet another example where the Punch pipeline modularity excels. With a single punchlet, our 70 standard log parsers can now generate ECS-conforming data through a mere configuration change. You switch from the original pipeline:

[Figure: the original pipeline configuration]

to the same pipeline extended with the ECS punchlet:

[Figure: the pipeline with the ecs-convertor punchlet, depicted in red, appended]

It cannot be simpler.
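The idea can be sketched as follows. This is a plain Python analogy, not actual Punch configuration syntax, and the stage functions and field values are hypothetical: a pipeline is a chain of small processing functions, and ECS support amounts to appending one extra stage.

```python
# Sketch of pipeline modularity: each stage is a function taking and
# returning a document; adding ECS support means appending one stage.

def parse_apache(doc):
    # Hypothetical parser stage: extracts fields into the Punch taxonomy.
    doc["web.request.rc"] = 404
    doc["init.usr.name"] = "alice"
    return doc

def ecs_convertor(doc):
    # Hypothetical conversion stage mirroring the ecs-convertor punchlet:
    # renames Punch fields to their ECS equivalents, leaves the rest alone.
    renames = {
        "web.request.rc": "http.response.status_code",
        "init.usr.name": "source.user.name",
    }
    return {renames.get(key, key): value for key, value in doc.items()}

def run(pipeline, doc):
    for stage in pipeline:
        doc = stage(doc)
    return doc

legacy = run([parse_apache], {})                 # original pipeline
ecs = run([parse_apache, ecs_convertor], {})     # same pipeline + one stage
```

The parser itself is untouched; only the pipeline composition changes.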


ECS in Action

Let us see ECS in action. Once ECS data has been ingested into Elasticsearch, here is a standard and simple dashboard (available on the Punch marketplace). It illustrates how log data coming from different sources can all be visualised on a common dashboard. Note that this is ready to be demonstrated on a Punch standalone distribution.

[Figure: a common Kibana dashboard built on ECS data from multiple sources]

Let us focus on the details to better understand what ECS is about. Consider an Apache HTTP access log. Here is a comparison of the final document obtained before and after the conversion to ECS format. The original Apache log appears, truncated, in the event.original row of the table below.
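For reference, extracting such fields from an Apache "combined" access log line can be sketched as follows. This is a simplified stand-in for the real apache_httpd parser, with the log line reconstructed (without its syslog header) from the field values shown below:

```python
import re

# Simplified pattern for an Apache "combined" access log line.
# The real apache_httpd parser handles many more variants.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<urn>\S+) HTTP/(?P<version>\S+)" '
    r'(?P<rc>\d{3}) (?P<bytes>\d+)'
)

line = ('189.134.68.95 - alice [31/Dec/2012:03:00:00 +0100] '
        '"GET /software/winvn/index.php?q=3#article HTTP/1.0" 404 8368')

match = LOG_PATTERN.match(line)
fields = match.groupdict()
```

Each named group becomes one field of the parsed document, which the Punch parser then maps onto its taxonomy.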

Here are the fields:

Converted to ECS              Original Punch field           Value
ecs.version                   –                              v1.0.0-beta2
labels.channel                channel                        apache
labels.tenant                 tenant                         mytenant
labels.vendor                 vendor                         apache_httpd
event.type                    type                           web
event.action                  action                         Not Found
event.created                 obs.ts                         2012-12-31T03:00:00.000+01:00
event.original                message                        Feb 21 10:48:35 host2 189.134.68.95 – alice [31/Dec/2012:03:00:00…
event.alarm.id                alarm.id                       160018
event.severity                alarm.sev                      2
source.address                –                              189.134.68.95
source.ip                     init.host.ip                   189.134.68.95
source.geo.city_name          init.usr.loc.cty_short         Mexico City
source.geo.country_iso_code   init.usr.loc.country_short     MX
source.geo.country_name       init.usr.loc.country           Mexico
source.geo.location.lat       init.usr.loc.geo_point[1]      19.4342
source.geo.location.lon       init.usr.loc.geo_point[0]      -99.1386
source.user.name              init.usr.name                  alice
observer.hostname             obs.host.name                  host2
http.request.method           web.request.method             GET
http.request.referrer         web.header.referer             http://www.example.com/start.html
http.response.body.bytes      session.out.byte               8368
http.response.status_code     web.request.rc                 404
http.version                  web.header.version             1.0
url.path                      target.uri.urn                 /software/winvn/index.php?q=3#article
url.query                     –                              q=3
url.fragment                  –                              article
user_agent.original           web.header.user_agent          Mozilla/5.0 (Linux; Android 5.1.1; Nexus 5 Build/LMY48B; wv …

The remaining fields have no listed ECS equivalent and are kept as-is:

parser.name                   apache_httpd
parser.version                1.2.0
col.host.name                 punch-elitebook
lmc.parse.host.ip             127.0.0.1
lmc.parse.host.name           punch-elitebook
lmc.parse.ts                  2019-03-06T14:30:05.552+01:00
obs.ts                        2012-12-31T03:00:00.000+01:00
rep.host.name                 host2
rep.ts                        2019-02-21T10:48:35.000+01:00
size                          325
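The renaming shown in the table is mechanical enough to sketch in a few lines of Python. This is a simplified illustration of what the ecs-convertor punchlet does, using only mappings listed above; the mapping dictionary and helper names are ours, not the actual punchlet code:

```python
from urllib.parse import urlsplit

# A few of the Punch-to-ECS renames taken from the table above.
PUNCH_TO_ECS = {
    "init.host.ip": "source.ip",
    "init.usr.name": "source.user.name",
    "obs.host.name": "observer.hostname",
    "web.request.method": "http.request.method",
    "web.request.rc": "http.response.status_code",
    "session.out.byte": "http.response.body.bytes",
    "target.uri.urn": "url.path",
}

def to_ecs(doc):
    out = {PUNCH_TO_ECS.get(key, key): value for key, value in doc.items()}
    # url.query and url.fragment are derived by splitting the URN,
    # as in the url.* rows of the table.
    urn = out.get("url.path")
    if urn:
        parts = urlsplit(urn)
        out["url.path"] = parts.path
        if parts.query:
            out["url.query"] = parts.query
        if parts.fragment:
            out["url.fragment"] = parts.fragment
    return out

doc = {"target.uri.urn": "/software/winvn/index.php?q=3#article",
       "web.request.rc": 404,
       "init.usr.name": "alice"}
ecs = to_ecs(doc)
```

Fields with no ECS equivalent (the `lmc.*`, `col.*`, `rep.*` entries above) pass through unchanged.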

 

What Next?

The Punch parsers are provided as modules that can be deployed in various pipelines. By leveraging the ECS format, the Punch encourages its users to pay much more attention to their data normalization, and in turn to immediately benefit from the Elastic ecosystem. This is actually already the case: for various Thales applications, we now deploy solutions based on Beats rather than on other metric or log agents.

And because their data is well normalized, they can now execute machine learning processing on top of it using the Punch PML feature.

Stay tuned for more news on this soon.

Guillaume Fayemi, Dimitri Tombroff

Thanks for reading our blog. 

Questions ? contact@punchplatform.com