Pick of the Week at NebulaGraph - Configuration recommendations for data import
Normally, the weekly issue covers NebulaGraph Updates and Community Q&As. If something major happens, it is also covered in an additional Events of the Week section.
Events of the Week
- Live talk: NebulaGraph in practice with WeChat
Graphs show great promise in areas such as social network recommendation, real-time computing, risk control, and security. Storing and querying large-scale heterogeneous graph data efficiently with a graph database, however, remains a great challenge.
Most well-known graph databases struggle with big data sets. For example, the community edition of Neo4j, which is widely used in the graph field, only provides single-replica services. JanusGraph solves the storage problem for big data sets with external metadata management, KV storage, and indexing, but its performance is widely criticized.
How can Internet companies facing the challenge of storing and processing big data solve these problems with a graph database? In this live talk, Li Benli, a senior engineer on the WeChat team, shared his experience with us.
Li has previously written an article on this topic; read it here.
- New release: NebulaGraph 1.1.0 will be released next week
In this release, the dev team has greatly improved the stability and performance of NebulaGraph. There will also be a number of bug fixes. Stay tuned!
Updates to NebulaGraph in the last week:
Range scans on string-type indexes are no longer supported. Only the `==` condition can be used in the `WHERE` clause of a `LOOKUP` statement when filtering on string-type indexes, and the conditions must cover all the properties of the index. For more information, see PR #2283 and PR #2277.
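To make the restriction concrete, here is a small nGQL sketch. The tag `player` and its indexed string property `name` are hypothetical examples, assuming a single-property string index on `name`; they are not taken from the release notes:

```ngql
// Supported: exact equality on every property of the string index
LOOKUP ON player WHERE player.name == "Tony Parker";

// No longer supported: a range scan on a string-type index
// LOOKUP ON player WHERE player.name > "T";
```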
Fixed an issue where stopping the meta service before the job manager was initialized could cause meta service exceptions. For more information, see PR #2332.
Optimized the Raft logic. A delay is now added after a failed election to ensure that only one election request is in flight at a time. For more information, see PR #2305.
This week's topic is about suggestions for the Spark Writer configuration from @nicole, a community user.
Spark Writer Configuration Suggestions
Before using Spark Writer to import data, we need to prepare its configuration file.
@nicole recommends enriching the configuration file with comments: write a comment for every parameter, put parameters that have default values (such as the Spark-related parameters) into comments, and add a note reminding users to uncomment a parameter to make any modification take effect.
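The commenting style suggested above could look something like the fragment below. This is an illustrative sketch in the HOCON style used by Spark application configs, not a complete or authoritative Spark Writer file; the specific keys shown are assumptions for illustration:

```conf
{
  spark: {
    # Name shown in the Spark UI for this import job.
    app: {
      name: "Spark Writer"
    }
    # Spark driver settings. The values below are defaults;
    # uncomment a line to change it, otherwise it stays commented out.
    # driver: {
    #   cores: 1
    #   maxResultSize: "1G"
    # }
  }
  nebula: {
    # Addresses of the graph service, as "host:port" strings.
    addresses: ["127.0.0.1:3699"]
    # Name of the graph space to import data into.
    space: "test"
  }
}
```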
For the field mapping configuration of tags and edges, @nicole wonders whether an option could be added that automatically maps fields in the source data and NebulaGraph that share the same names. For tags and edges with more than 50 properties, such an option would save a lot of work.
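The suggested same-name auto-mapping could be sketched as follows. `auto_map_fields` is a hypothetical helper, not part of any released tool: given the column names of the source data and the property names of a tag, it pairs every name that appears on both sides, which is exactly the tedious part of writing a 50-property mapping by hand.

```python
def auto_map_fields(source_columns, tag_properties):
    """Pair source columns and tag properties that share a name.

    Returns (fields, nebula_fields): two parallel lists suitable for
    the source-field and target-field entries of an import config.
    Source column order is preserved; unmatched names are skipped.
    """
    wanted = set(tag_properties)
    common = [c for c in source_columns if c in wanted]
    return common, list(common)

# Example: only names present on both sides are mapped.
fields, nebula_fields = auto_map_fields(
    ["player_id", "name", "age", "updated_at"],
    ["name", "age", "team"],
)
print(fields)         # ['name', 'age']
print(nebula_fields)  # ['name', 'age']
```

Unmatched properties (such as `team` above) would still need an explicit mapping entry, so the option complements rather than replaces the manual configuration.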
Recommended for You
In this article, you'll learn about the implementation of NebulaGraph Exchange, a data import tool based on Spark, and how to import data with it.