Metamapper
Metamapper is an open-source metadata management platform that aims to make it easier to share data and its context across your organization. It's a self-updating data catalog complete with full-text search, an integrated commenting system, and much more.
What we're trying to accomplish
Growing organizations rely on data and analytics to drive decisions. With the emergence of tools like Airflow and companies like Segment and Fivetran, it's never been easier to get data into your warehouse.
But all of this data brings a lot of noise. It can become difficult to keep track of things like the business purpose or timeliness of your data. Plus, writing and maintaining that sort of documentation is just plain boring.
Metamapper aims to automate those boring documentation tasks and reduce the time that data engineers spend answering redundant questions. Just connect your data warehouse and Metamapper will periodically scan the datastore and maintain a commentable data catalog that can be viewed by your team via the UI.
Think of it as Google for your data warehouse – perform a search and it'll find the data that best fits your needs.
Here are a few features of Metamapper:
- Browser-based: Everything runs in your browser, with a shareable URL you can give to your team.
- Schema inspection: Metamapper crawls your database schema(s) every few hours and maintains a comprehensive data catalog.
- Change detection: Detects when data definitions change between schema inspection runs. Useful for alerting on uncommunicated changes.
- Annotations: Supports comments on almost every object so your team can crowdsource knowledge about data assets.
- Custom Properties: Easily attach custom metadata to databases and tables, such as data steward or ETL process references.
- Search: Everything is indexed and searchable. Self-service data analytics, here we come!
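To make the change detection idea concrete, here is a minimal sketch of comparing two schema inspection runs. It is an illustration only, not Metamapper's actual implementation: the snapshot shape (a mapping of table name to column name to data type) and the `diff_snapshots` helper are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot shape: {"schema.table": {"column_name": "data_type"}}
Snapshot = Dict[str, Dict[str, str]]


def diff_snapshots(previous: Snapshot, current: Snapshot) -> List[Tuple[str, str, str]]:
    """Return (change_type, table, detail) tuples describing what changed
    between two inspection runs. Illustrative helper, not a Metamapper API."""
    changes: List[Tuple[str, str, str]] = []

    # Tables present in only one of the two runs.
    for table in sorted(previous.keys() - current.keys()):
        changes.append(("table_dropped", table, ""))
    for table in sorted(current.keys() - previous.keys()):
        changes.append(("table_added", table, ""))

    # Tables present in both runs: compare their columns.
    for table in sorted(previous.keys() & current.keys()):
        old_cols, new_cols = previous[table], current[table]
        for col in sorted(old_cols.keys() - new_cols.keys()):
            changes.append(("column_dropped", table, col))
        for col in sorted(new_cols.keys() - old_cols.keys()):
            changes.append(("column_added", table, col))
        for col in sorted(old_cols.keys() & new_cols.keys()):
            if old_cols[col] != new_cols[col]:
                detail = f"{col}: {old_cols[col]} -> {new_cols[col]}"
                changes.append(("type_changed", table, detail))

    return changes


if __name__ == "__main__":
    # Made-up sample data for two consecutive runs.
    previous_run = {
        "public.users": {"id": "integer", "email": "varchar"},
        "public.orders": {"id": "integer"},
    }
    current_run = {
        "public.users": {"id": "bigint", "name": "text"},
        "public.events": {"id": "integer"},
    }
    for change in diff_snapshots(previous_run, current_run):
        print(change)
```

Each tuple in the result could then feed an alert or a changelog entry, so that a dropped column or a changed type does not go unnoticed.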
Quickstart
You can try out a default version of Metamapper with sample data using Docker and Docker Compose.
Clone the repository:
git clone git@github.com:getmetamapper/metamapper.git
From the repository root:
docker-compose -f docker-quickstart.yml up
Head to http://localhost:5555 to view the Metamapper UI. Try searching for "clickstream events" and see what happens!
Installation
Use our pre-baked Docker images. Detailed setup instructions can be found here: https://github.com/getmetamapper/metamapper-setup
Documentation
- https://www.metamapper.io/docs/
Supported datastores
Metamapper currently supports automatic crawling and indexing of the following datastores, with plans to add more in the near future.
- Amazon Redshift
- AWS Athena
- AWS Glue
- Azure SQL Database
- Azure Synapse (formerly Azure DW)
- Google BigQuery
- Hive Metastore
- Microsoft SQL Server
- MySQL
- Oracle
- PostgreSQL
- Snowflake
Community / Get Involved
- Join the discussion on Discord (http://discuss.metamapper.io)
- Join the mailing list
- Give feedback through this Typeform survey