Analytixus [ˌa.naˈlyː.tɪk.sʊs] the Analytics Druid
Analytixus brings order to complex data landscapes. With a consistent metadata-first approach, metadata becomes the central source of truth from which code, pipelines, transformations, and documentation are generated automatically.
Every data and analytics project (data lakehouse) needs a framework with best practices. This one is worth a look if you’re after a data modeling approach that doesn’t overcomplicate things, focuses on data quality, and supports massively parallel processing.
In modern Data & Analytics projects, metadata is the backbone of automation and governance. Most of the time, we can derive structural metadata directly from the code — for example, by parsing SQL CREATE TABLE statements into XML-based models.
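As a rough illustration of deriving structural metadata from DDL, the following Python sketch parses a simple CREATE TABLE statement into an XML model. It is not the Analytixus parser. The function name `ddl_to_xml`, the element and attribute names (`Table`, `Column`, `Name`, `DataType`, `Nullable`), and the regex-based parsing are assumptions made for illustration, and the regexes only handle simple definitions with one column per line.

```python
import re
import xml.etree.ElementTree as ET

# Shortened sample DDL with backtick-quoted identifiers, one column per line.
DDL = """
CREATE OR REPLACE TABLE `WWI`.`Store_Countries`
(
  `DWH_SOURCETABLE` STRING
, `CountryID` INT NOT NULL
, `CountryName` STRING NOT NULL
);
"""

TABLE_RE = re.compile(r"CREATE\s+(?:OR\s+REPLACE\s+)?TABLE\s+`(\w+)`\.`(\w+)`", re.I)
COLUMN_RE = re.compile(r"^\s*,?\s*`(\w+)`\s+(\w+)(.*)$")


def ddl_to_xml(ddl: str) -> ET.Element:
    """Build a small XML model (assumed schema) from a simple CREATE TABLE."""
    match = TABLE_RE.search(ddl)
    if match is None:
        raise ValueError("no CREATE TABLE statement found")
    table = ET.Element("Table", Schema=match.group(1), Name=match.group(2))
    for line in ddl[match.end():].splitlines():
        col = COLUMN_RE.match(line)
        if col is None:
            continue  # skip parentheses, comment-only lines, blank lines
        name, data_type, rest = col.groups()
        nullable = "false" if "NOT NULL" in rest.upper() else "true"
        ET.SubElement(table, "Column", Name=name,
                      DataType=data_type.upper(), Nullable=nullable)
    return table


model = ddl_to_xml(DDL)
print(ET.tostring(model, encoding="unicode"))
```

A real parser would need to handle multi-word types, defaults, and constraints; the point here is only that table and column structure can be read straight from the statement.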
But sometimes, SQL alone isn’t enough. There are attributes we want to capture — like IsBusinessKey, IsTechnical, or History Flags — that cannot be expressed syntactically in the DDL.
This is where code comments come into play. In Analytixus, we use comments as metadata carriers. During the build process, they are extracted, transformed, and converted into actionable metadata.
From Code to Metadata
Here’s an example generated with an XSLT template:
CREATE OR REPLACE TABLE `WWI`.`Store_Countries`
(
    -- Technical Fields
    `DWH_DEF_DATE` TIMESTAMP NOT NULL /*IsTechnical*/
  , `DWH_LOAD_ID` INT NOT NULL /*IsTechnical*/
  , `DWH_RECORD_ID` BIGINT GENERATED ALWAYS AS IDENTITY NOT NULL /*IsTechnical*/
  , `DWH_SOURCE_ID` INT NOT NULL /*IsTechnical*/
  , `DWH_SOURCETABLE` STRING /*IsTechnical*/
    -- Custom Fields
  , `CountryID` INT NOT NULL
  , `CountryName` STRING NOT NULL
  , `FormalName` STRING NOT NULL
);
Notice the inline comment /*IsTechnical*/. During the build, it is extracted and carried into the resulting XML metadata as an IsTechnical attribute of the column.
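The exact XML schema is not shown in this article. As a minimal sketch of the extraction step, assuming each `/*Flag*/` comment on a column line becomes a boolean attribute on that column's element, the idea could look like this in Python. The element and attribute layout (`Columns`, `Column`, `Name`, `IsTechnical="true"`) and the helper `columns_with_flags` are assumptions for illustration; Analytixus itself works with XSLT templates, and its real schema may differ.

```python
import re
import xml.etree.ElementTree as ET

# Two column lines adapted from the DDL above: one flagged, one not.
COLUMN_LINES = [
    ", `DWH_LOAD_ID` INT NOT NULL /*IsTechnical*/",
    ", `CountryID` INT NOT NULL",
]

NAME_RE = re.compile(r"`(\w+)`")              # first backtick-quoted identifier
FLAG_RE = re.compile(r"/\*\s*(\w+)\s*\*/")    # inline /*Flag*/ comments


def columns_with_flags(lines):
    """Turn column lines into <Column> elements; each /*Flag*/ becomes Flag="true"."""
    root = ET.Element("Columns")
    for line in lines:
        name = NAME_RE.search(line)
        if name is None:
            continue  # not a column definition
        column = ET.SubElement(root, "Column", Name=name.group(1))
        for flag in FLAG_RE.findall(line):
            column.set(flag, "true")
    return root


columns = columns_with_flags(COLUMN_LINES)
print(ET.tostring(columns, encoding="unicode"))
```

Because the flag name is taken from the comment text, other markers would flow through the same way without changing the extraction code.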
Then, attributes like HistObjectId and OriginId are resolved, and the metadata is enriched into supervised metadata:
History objects are linked.
Origins are connected to their Bronze counterparts.
Relationships and constraints are automatically added.
This metadata can now be consumed by further XSLT templates and DNA-ML models.
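The enrichment step can be pictured as a lookup pass over the extracted metadata. The Python sketch below is an illustration under assumptions, not the Analytixus implementation: the dictionary layout, the `Id`, `Layer`, and `Name` keys, the Silver layer, the `_Hist` naming convention for history objects, and matching origins by name are all hypothetical. Only the attribute names `HistObjectId` and `OriginId` and the Bronze layer come from the text.

```python
# Hypothetical unsupervised metadata: one entry per object, with an assumed Id.
objects = [
    {"Id": 1, "Layer": "Bronze", "Name": "Store_Countries"},
    {"Id": 2, "Layer": "Silver", "Name": "Store_Countries"},
    {"Id": 3, "Layer": "Silver", "Name": "Store_Countries_Hist"},
]


def enrich(objects):
    """Return copies of the objects with OriginId and HistObjectId resolved.

    Assumptions: a non-Bronze object originates from the Bronze object with
    the same name, and its history object is named '<Name>_Hist' in the same
    layer. Unresolved references are stored as None.
    """
    by_key = {(o["Layer"], o["Name"]): o["Id"] for o in objects}
    enriched = []
    for obj in objects:
        item = dict(obj)  # copy so the input metadata stays untouched
        if obj["Layer"] != "Bronze":
            item["OriginId"] = by_key.get(("Bronze", obj["Name"]))
        item["HistObjectId"] = by_key.get((obj["Layer"], obj["Name"] + "_Hist"))
        enriched.append(item)
    return enriched


for item in enrich(objects):
    print(item)
```

Resolving references into plain ids keeps the enriched metadata easy to consume downstream, since a template only has to follow an id rather than repeat the matching logic.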
Benefits of Using Comments as Metadata
✅ Flexibility: Developers influence metadata directly in the code.
✅ Consistency: Attributes are extracted systematically.
✅ Traceability: Origins and history objects provide lineage across layers.
✅ Automation: Downstream templates can act on enriched metadata.
Conclusion
By using code comments as metadata carriers, we bridge the gap between SQL definitions and rich semantic models.
In Analytixus, comments are not just documentation — they are drivers of automation. They ensure that attributes like IsBK, IsTechnical, and IsHist become first-class citizens in the metadata layer.
This approach transforms unsupervised metadata into supervised metadata, enabling automation, lineage, and governance without slowing down development.