Author: Xiaoguang Sun (Software Architect at Zhihu, TiKV Project Maintainer)

Transcreator: Ran Huang; Editor: Tom Dewan

Your organization is growing with each passing day; so is your data. More data brings more business opportunities, but it also begets higher storage costs. You want a better way to manage the cost? We want the same thing for our open source database, TiDB.

TiDB is a distributed SQL database designed for massive data. Our goal is to support large-scale datasets at a reasonable cost. At TiDB Hackathon 2020, we took a big step in that direction. We introduced a feature, the time…

Industry: Artificial Intelligence


Authors:

  • Mingxing Qu (Architect at PatSnap Data Warehouse Team)
  • Liang Li (Senior DBA at PatSnap Data Warehouse Team)

Transcreator: Caitin Chen; Editor: Tom Dewan

PatSnap is a global patent search database that integrates 150 million+ patent data records, 170 million+ chemical structure data records, and thousands of records about financial news, scientific literature, market reports, and investment information. Our users can search, browse, and translate patents, and generate patent analysis reports. We help 10,000+ customers in 50+ countries make better innovation decisions.

As our business developed, our data size grew quickly. Previously, we used the Segment +…

Author: PingCAP

Transcreators: Ran Huang, Caitin Chen; Editor: Tom Dewan

Since its release, TiDB 5.0 has been widely used in production across industries, including finance, internet and new economy, and logistics. It has been well received by users:

  • At 58 Finance and Anjuke, TiDB serves complex reads and join queries for data warehouse reporting. For multi-table join queries, TiDB 5.0 performs better than TiDB 4.0 by 90%.
  • NetEase Games tested TiDB 5.0 and found it was more stable than TiDB 4.0, with no noticeable jitter.
  • Autohome uses TiDB 5.0 for join and aggregation queries. TiDB 5.0’s …

PingCAP is proud to announce today that the company has achieved the International Organization for Standardization (ISO) 27001:2013 certification for TiDB Cloud. Following an extensive audit process, the certification was issued by British Standards Institution (BSI), an ANAB-accredited certification body headquartered in London.

ISO/IEC 27001:2013 is a globally recognized standard that sets out the policies and requirements for establishing, implementing, maintaining, and continually improving an information security management system (ISMS). …

Author: Hexi Lee (Software Engineer Intern at PingCAP)

Transcreator: Ran Huang; Editor: Tom Dewan

TiKV is a distributed key-value storage engine, featuring strong consistency and partition tolerance. It can act either as the storage engine for TiDB or as an independent transactional key-value database. Do you know what else it is capable of?

At TiDB Hackathon 2020, our team built a TiKV-based distributed POSIX file system, TiFS, which inherits the powerful features of TiKV and also taps into TiKV’s possibilities beyond data storage.

In this post, I’ll walk you through every detail of TiFS: how we came up with the…

Authors: Software Engineers at PingCAP

Transcreator: Ran Huang; Editor: Tom Dewan

Contributing to TiDB’s codebase is not easy, especially for newbies. As a distributed database, TiDB has multiple components and numerous tools, written in multiple languages including Go and Rust. Getting started with such a complicated system takes quite an effort.

So, to welcome newcomers to TiDB and make it easier for them to contribute to our community, we’ve developed a TiDB integrated development environment: TiDE. Created during TiDB Hackathon 2020, TiDE is a Visual Studio Code extension that makes developing TiDB a breeze…

Author: Wenbo Zhang (Linux Kernel Engineer of the EE team at PingCAP)

Transcreator: Charlotte Liu; Editor: Tom Dewan

In Linux Kernel vs. Memory Fragmentation (Part I), I concluded that grouping by migration types only delays memory fragmentation, but does not fundamentally solve it. As memory fragmentation increases and the system runs short of contiguous physical memory, performance degrades.

Therefore, to mitigate the performance degradation, the Linux kernel community introduced memory compaction to the kernel.

In this post, I’ll explain the principle of memory compaction, how to view the fragmentation index, and how to quantify the latency overheads caused by…

Author: Ke’ao Yang (Software Engineer at PingCAP)

Transcreator: Ran Huang; Editor: Tom Dewan

At the heart of modern software is layered abstraction. In abstraction, each layer hides details that are not relevant to other layers and provides a simple, functioning interface.

However, because abstraction brings the extra overhead of system calls, its simplicity comes at the cost of performance. Thus, for performance-sensitive software such as databases, abstraction might bring unwanted consequences. How can we boost performance for these applications?

In this article, I’ll talk about the pitfalls of abstraction in the modern storage structure and how we tackled the problem…


Authors:

  • Ruoxi Sun (Tech Lead of Analytical Computing Team at PingCAP)
  • Fei Xu (Software Engineer at PingCAP)

Editors: Tom Dewan, Caitin Chen

TiDB is a Hybrid Transaction/Analytical Processing (HTAP) database that can efficiently process analytical queries. However, when large amounts of data are involved, the CPU becomes the bottleneck for processing queries that include JOIN statements and/or aggregation functions.

At the same time, the GPU is rapidly gaining popularity in areas such as scientific computing, AI, and data processing. It outperforms the CPU by orders of magnitude in such areas. …


Authors:

  • Yueyue Zhou (Product Expert at PingCAP)
  • Dan Su (Product Expert at PingCAP)

Transcreator: Ran Huang; Editor: Tom Dewan

In recent years, big data has become not just a buzzword, but also a growing need for ambitious companies. With skyrocketing data scale and stringent requirements on data freshness, big data-related scenarios are becoming complicated and multi-dimensional. Thus, many companies use real-time data warehouses to meet their business demands.

But data warehouses are not the only option. An emerging category of databases, Hybrid Transactional/Analytical Processing (HTAP) databases, can serve you just as well as data warehouses, if not better. …


PingCAP is the team behind TiDB, an open-source, MySQL-compatible NewSQL HTAP database.
