docs/getting-started/index.md (30 additions, 0 deletions)
@@ -21,3 +21,33 @@ functions in ClickHouse. The sample datasets include:
<!-- The following table is automatically generated at build time
by https://github.com/ClickHouse/clickhouse-docs/blob/main/scripts/autogenerate-table-of-contents.sh -->
+| Page | Description |
+|-----|-----|
+|[New York Taxi Data](/getting-started/example-datasets/nyc-taxi)| Data for billions of taxi and for-hire vehicle (Uber, Lyft, etc.) trips originating in New York City since 2009 |
+|[Terabyte Click Logs from Criteo](/getting-started/example-datasets/criteo)| A terabyte of Click Logs from Criteo |
+|[WikiStat](/getting-started/example-datasets/wikistat)| Explore the WikiStat dataset containing 0.5 trillion records. |
+|[TPC-DS (2012)](/getting-started/example-datasets/tpcds)| The TPC-DS benchmark data set and queries. |
+|[Recipes Dataset](/getting-started/example-datasets/recipes)| The RecipeNLG dataset, containing 2.2 million recipes |
+|[COVID-19 Open-Data](/getting-started/example-datasets/covid19)| COVID-19 Open-Data is a large, open-source database of COVID-19 epidemiological data and related factors like demographics, economics, and government responses |
+|[NOAA Global Historical Climatology Network](/getting-started/example-datasets/noaa)| 2.5 billion rows of climate data for the last 120 yrs |
+|[GitHub Events Dataset](/getting-started/example-datasets/github-events)| Dataset containing all events on GitHub from 2011 to Dec 6 2020, with a size of 3.1 billion records. |
+|[Amazon Customer Review](/getting-started/example-datasets/amazon-reviews)| Over 150M customer reviews of Amazon products |
+|[Brown University Benchmark](/getting-started/example-datasets/brown-benchmark)| A new analytical benchmark for machine-generated log data |
+|[Writing Queries in ClickHouse using GitHub Data](/getting-started/example-datasets/github)| Dataset containing all of the commits and changes for the ClickHouse repository |
+|[Analyzing Stack Overflow data with ClickHouse](/getting-started/example-datasets/stackoverflow)| Analyzing Stack Overflow data with ClickHouse |
+|[AMPLab Big Data Benchmark](/getting-started/example-datasets/amplab-benchmark)| A benchmark dataset used for comparing the performance of data warehousing solutions. |
+|[New York Public Library "What's on the Menu?" Dataset](/getting-started/example-datasets/menus)| Dataset containing 1.3 million records of historical data on the menus of hotels, restaurants and cafes with the dishes along with their prices. |
+|[Laion-400M dataset](/getting-started/example-datasets/laion-400m-dataset)| Dataset containing 400 million images with English image captions |
+|[Star Schema Benchmark (SSB, 2009)](/getting-started/example-datasets/star-schema)| The Star Schema Benchmark (SSB) data set and queries |
+|[The UK property prices dataset](/getting-started/example-datasets/uk-price-paid)| Learn how to use projections to improve the performance of queries that you run frequently using the UK property dataset, which contains data about prices paid for real-estate property in England and Wales |
+|[Reddit comments dataset](/getting-started/example-datasets/reddit-comments)| Dataset containing publicly available comments on Reddit from December 2005 to March 2023 with over 14B rows of data in JSON format |
+|[OnTime](/getting-started/example-datasets/ontime)| Dataset containing the on-time performance of airline flights |
+|[Taiwan Historical Weather Datasets](/getting-started/example-datasets/tw-weather)| 131 million rows of weather observation data for the last 128 yrs |
+|[Crowdsourced air traffic data from The OpenSky Network 2020](/getting-started/example-datasets/opensky)| The data in this dataset is derived and cleaned from the full OpenSky dataset to illustrate the development of air traffic during the COVID-19 pandemic. |
+|[NYPD Complaint Data](/getting-started/example-datasets/nypd_complaint_data)| Ingest and query Tab Separated Value data in 5 steps |
+|[TPC-H (1999)](/getting-started/example-datasets/tpch)| The TPC-H benchmark data set and queries. |
+|[Foursquare places](/getting-started/example-datasets/foursquare-places)| Dataset with over 100 million records containing information about places on a map, such as shops, restaurants, parks, playgrounds, and monuments. |
+|[YouTube dataset of dislikes](/getting-started/example-datasets/youtube-dislikes)| A collection of dislikes of YouTube videos. |
+|[Geo Data using the Cell Tower Dataset](/getting-started/example-datasets/cell-towers)| Learn how to load OpenCelliD data into ClickHouse, connect Apache Superset to ClickHouse and build a dashboard based on data |
+|[Environmental Sensors Data](/getting-started/example-datasets/environmental-sensors)| Over 20 billion records of data from Sensor.Community, a contributors-driven global sensor network that creates Open Environmental Data. |
+|[Anonymized Web Analytics](/getting-started/example-datasets/metrica)| Dataset consisting of two tables containing anonymized web analytics data with hits and visits |
The OpenTelemetry project includes a [demo application](https://opentelemetry.io/docs/demo/). A maintained fork of this application with ClickHouse as a data source for logs and traces can be found [here](https://github.com/ClickHouse/opentelemetry-demo). The [official demo instructions](https://opentelemetry.io/docs/demo/docker-deployment/) can be followed to deploy this demo with Docker. In addition to the [existing components](https://opentelemetry.io/docs/demo/collector-data-flow-dashboard/), an instance of ClickHouse will be deployed and used for the storage of logs and traces.
docs/use-cases/observability/build-your-own/grafana.md (3 additions, 3 deletions)
@@ -23,11 +23,11 @@ import Image from '@theme/IdealImage';
Grafana represents the preferred visualization tool for Observability data in ClickHouse. This is achieved using the official ClickHouse plugin for Grafana. Users can follow the installation instructions found [here](/integrations/grafana).

V4 of the plugin makes logs and traces a first-class citizen in a new query builder experience. This minimizes the need for SREs to write SQL queries and simplifies SQL-based Observability, moving the needle forward for this emerging paradigm.
-Part of this has been placing Open Telemetry (OTel) at the core of the plugin, as we believe this will be the foundation of SQL-based Observability over the coming years and how data will be collected.
+Part of this has been placing OpenTelemetry (OTel) at the core of the plugin, as we believe this will be the foundation of SQL-based Observability over the coming years and how data will be collected.

-## Open Telemetry Integration {#open-telemetry-integration}
-On configuring a Clickhouse datasource in Grafana, the plugin allows the users to specify a default database and table for logs and traces and whether these tables conform to the OTel schema. This allows the plugin to return the columns required for correct log and trace rendering in Grafana. If you've made changes to the default OTel schema and prefer to use your own column names, these can be specified. Usage of the default OTel column names for columns such as time (Timestamp), log level (SeverityText), or message body (Body) means no changes need to be made.
+On configuring a ClickHouse datasource in Grafana, the plugin allows the user to specify a default database and table for logs and traces and whether these tables conform to the OTel schema. This allows the plugin to return the columns required for correct log and trace rendering in Grafana. If you've made changes to the default OTel schema and prefer to use your own column names, these can be specified. Usage of the default OTel column names for columns such as time (`Timestamp`), log level (`SeverityText`), or message body (`Body`) means no changes need to be made.
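
The column-mapping idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration and not the plugin's actual implementation: the `LogColumns` class, the `build_log_query` helper, and the database and table names (`otel.otel_logs`, `app.logs`) are hypothetical. Only the default column names `Timestamp`, `SeverityText`, and `Body` come from the text. Identifier quoting and escaping are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogColumns:
    """Hypothetical holder for the columns needed to render a log line.

    The defaults are the OTel column names mentioned in the text.
    """
    time: str = "Timestamp"
    level: str = "SeverityText"
    message: str = "Body"


def build_log_query(database: str, table: str,
                    columns: LogColumns = LogColumns(),
                    limit: int = 100) -> str:
    """Build a SQL string that selects the time, level, and message columns.

    Illustrative only: identifiers are interpolated without quoting.
    """
    return (
        f"SELECT {columns.time} AS time, {columns.level} AS level, "
        f"{columns.message} AS message "
        f"FROM {database}.{table} "
        f"ORDER BY {columns.time} DESC "
        f"LIMIT {limit}"
    )


# Default OTel column names: no mapping needs to be supplied.
default_query = build_log_query("otel", "otel_logs")

# Customized schema: the caller supplies its own column names.
custom_query = build_log_query(
    "app", "logs", LogColumns(time="ts", level="lvl", message="msg")
)

print(default_query)
print(custom_query)
```

With the defaults, nothing beyond the database and table has to be given; a customized schema only needs the three column names overridden.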
:::note HTTP or Native
Users can connect Grafana to ClickHouse over either the HTTP or Native protocol. The latter offers marginal performance advantages which are unlikely to be appreciable in the aggregation queries issued by Grafana users. Conversely, the HTTP protocol is typically simpler for users to proxy and introspect.
description: 'Landing page for building your own observability stack'
+---
+
+This guide helps you build a custom observability stack using ClickHouse as the foundation. Learn how to design, implement, and optimize your observability solution for logs, metrics, and traces, with practical examples and best practices.
+|[Introduction](/use-cases/observability/introduction)| This guide is designed for users looking to build their own observability solution using ClickHouse, focusing on logs and traces. |
+|[Schema design](/use-cases/observability/schema-design)| Learn why users are recommended to create their own schema for logs and traces, along with some best practices for doing so. |
+|[Managing data](/observability/managing-data)| Deployments of ClickHouse for observability invariably involve large datasets, which need to be managed. ClickHouse offers features to assist with data management. |
+|[Integrating OpenTelemetry](/observability/integrating-opentelemetry)| Collecting and exporting logs and traces using OpenTelemetry with ClickHouse. |
+|[Using Visualization Tools](/observability/grafana)| Learn how to use observability visualization tools for ClickHouse, including HyperDX and Grafana. |
+|[Demo Application](/observability/demo-application)| Explore the OpenTelemetry demo application forked to work with ClickHouse for logs and traces. |