News

May 2018

Pension fund: investment strategy is not a secret

When we wanted to find out from our pension fund how […]

April 2018

New website!

After five years, we have given our website a new design […]

April 2018

Scala Meetup: Traceable Data Entities with Spark 2.x

One of the most common abstractions for a big data platform is a “Data Lake” […]

February 2018

First Scala Meetup in Bern

Tegonal would like to invite you to the first Scala Meetup in Bern […]

June 2017

Not so Open Government

With the help of small textbook examples, new frameworks and technologies are often […]

June 2017

R.I.P. jusearch!

Tegonal is discontinuing the legal search engine ju§earch! […]

March 2017

OpenOlitor member portal online

In regional contract farming (Regionale Vertragslandwirtschaft, RVL), […] receive […]

February 2017

Tegonal supports WE SHAPE TECH

The network of women from the technology sector meets […]

Scala Meetup: Traceable Data Entities with Spark 2.x

One of the most common abstractions for a big data platform is a “Data Lake”. Data is brought into the lake, then filtered, parsed and transformed, and in the process many more data assets are created. Metadata describes the data, and as the amount of data grows, it becomes both more important and harder to properly describe the data, its schema and its lineage.

Spark (2.x) is one of the most important tools in a data engineer's or data scientist's toolbox, but it currently offers very little help in connecting the input data sources to the output data sources.

At Sqooba, they decided to extend Spark's built-in event mechanism to get more granular information about when data is used as an input or output of a Spark application. This makes it possible to use event listeners to update the relevant entities in Apache Atlas and so obtain real-time data lineage and metadata.
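To make the listener idea concrete, here is a minimal, dependency-free sketch of the pattern in plain Scala. It is not Sqooba's implementation and uses neither Spark's nor Apache Atlas's API: `DataEvent`, `EventBus`, `LineageListener` and the dataset names are all hypothetical stand-ins, and the in-memory set of edges merely plays the role that a metadata catalog would play.

```scala
// Hypothetical model of "events about inputs and outputs, consumed by a listener".
// None of these names come from Spark, Apache Atlas, or the talk itself.
sealed trait DataEvent { def appId: String; def dataset: String }
final case class DataRead(appId: String, dataset: String) extends DataEvent
final case class DataWritten(appId: String, dataset: String) extends DataEvent

trait DataEventListener { def onEvent(event: DataEvent): Unit }

// Stand-in for an event bus: delivers every posted event to all registered listeners.
final class EventBus {
  private var listeners = Vector.empty[DataEventListener]
  def register(listener: DataEventListener): Unit = listeners = listeners :+ listener
  def post(event: DataEvent): Unit = listeners.foreach(_.onEvent(event))
}

// Stand-in for the catalog update: keeps (input, output) lineage edges in memory.
final class LineageListener extends DataEventListener {
  private var inputsByApp = Map.empty[String, Set[String]]
  private var edges = Set.empty[(String, String)]

  def onEvent(event: DataEvent): Unit = event match {
    case DataRead(app, ds) =>
      // Remember every dataset the application has read so far.
      inputsByApp = inputsByApp.updated(app, inputsByApp.getOrElse(app, Set.empty[String]) + ds)
    case DataWritten(app, ds) =>
      // Link each input read so far by this application to the written dataset.
      edges = edges ++ inputsByApp.getOrElse(app, Set.empty[String]).map(in => (in, ds))
  }

  def lineage: Set[(String, String)] = edges
}

object LineageDemo {
  // Simulates one application that reads one dataset and writes another.
  def run(): Set[(String, String)] = {
    val bus = new EventBus
    val listener = new LineageListener
    bus.register(listener)
    bus.post(DataRead("app-1", "raw/events"))
    bus.post(DataWritten("app-1", "clean/events"))
    listener.lineage
  }
}
```

In this simplified model an output is linked only to inputs that were read before it was written; a real integration would have to decide how to attribute inputs to outputs and how to map the edges onto catalog entities.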

Bern Scala User Group