
墨冊

ElasticSearch Operations - Reindex

Reindex
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

Use reindex when you want to merge several concrete indices into one, for example when the indices are unbalanced (too many tiny indices) or when the number of shards cannot be shrunk to a suitable number.

Brief introduction
1. Create the new destination index.
2. Call the _reindex API.
3. Check the status of the running reindex task.
4. Restore the index settings on the new destination index.
5. Delete the old indices.

Example
We want to merge qlog-fff-202105-000002 and qlog-fff-202104-1. Run
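The five reindex steps can be sketched as a sequence of REST calls. This is a minimal sketch and not the post's own commands. The destination name `qlog-fff-merged`, the shard and replica counts, and the idea of keeping replicas at 0 during the copy are assumptions made for illustration. The code only builds the requests as plain data, so it can be read or printed without a cluster.

```python
import json

# Source indices named in the example.
SOURCE_INDICES = ["qlog-fff-202105-000002", "qlog-fff-202104-1"]
# Hypothetical destination name; the excerpt does not give one.
DEST_INDEX = "qlog-fff-merged"


def reindex_plan(sources, dest, shards=1, replicas=1):
    """Return the merge as a list of (method, path, body) tuples."""
    return [
        # 1. Create the destination index. Replicas are kept at 0 while
        #    copying (an assumption, to make the copy cheaper).
        ("PUT", f"/{dest}",
         {"settings": {"index": {"number_of_shards": shards,
                                 "number_of_replicas": 0}}}),
        # 2. Start the reindex as a background task.
        ("POST", "/_reindex?wait_for_completion=false",
         {"source": {"index": list(sources)}, "dest": {"index": dest}}),
        # 3. Poll the running reindex tasks.
        ("GET", "/_tasks?detailed=true&actions=*reindex", None),
        # 4. Restore the settings that were relaxed in step 1.
        ("PUT", f"/{dest}/_settings",
         {"index": {"number_of_replicas": replicas}}),
        # 5. Delete the old indices, only after the copy has been verified.
        ("DELETE", "/" + ",".join(sources), None),
    ]


if __name__ == "__main__":
    for method, path, body in reindex_plan(SOURCE_INDICES, DEST_INDEX):
        print(method, path)
        if body is not None:
            print(json.dumps(body, indent=2))
```

Each tuple maps to one request that could be sent with curl or Kibana Dev Tools. Step 5 is destructive, so it is worth comparing document counts between the sources and the destination before running it.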

ElasticSearch Operations - Shrink Shard

Shrink the number of shards

Brief introduction
1. Move all of the index's data onto the same data node and block writes.
2. Check the data-moving status until it finishes.
3. Call the _shrink API, assigning the new index name and settings (including the number of shards).
4. Check the creation status of the new index.
5. Set the replicas back, and check whether doc.count is the same for the new index and the old one.
6. Delete the old index. (Be careful for
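The shrink steps can likewise be sketched as a request sequence. This is a minimal sketch under stated assumptions: the index names, the node name, and the replica count passed in are hypothetical, and the calls are only built as data rather than sent. The helper `valid_shrink_targets` reflects the rule that the target shard count must be a factor of the source shard count.

```python
def valid_shrink_targets(source_shards):
    """Shard counts an index with `source_shards` primaries can shrink to."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


def shrink_plan(source, target, node_name, target_shards, replicas=1):
    """Return the shrink as a list of (method, path, body) tuples."""
    return [
        # 1. Drop replicas, pin every shard to one node, and block writes.
        ("PUT", f"/{source}/_settings",
         {"settings": {"index.number_of_replicas": 0,
                       "index.routing.allocation.require._name": node_name,
                       "index.blocks.write": True}}),
        # 2. Watch the shards until they have all relocated.
        ("GET", f"/_cat/shards/{source}?v", None),
        # 3. Shrink into the new index; None serializes to JSON null,
        #    which clears the pinning and the write block on the target.
        ("POST", f"/{source}/_shrink/{target}",
         {"settings": {"index.number_of_shards": target_shards,
                       "index.number_of_replicas": 0,
                       "index.routing.allocation.require._name": None,
                       "index.blocks.write": None}}),
        # 4. Check that the new index has finished recovering.
        ("GET", f"/_cat/recovery/{target}?v", None),
        # 5. Set the replicas back, then compare document counts.
        ("PUT", f"/{target}/_settings",
         {"index": {"number_of_replicas": replicas}}),
        ("GET", f"/_cat/indices/{source},{target}?v&h=index,docs.count", None),
        # 6. Delete the old index only once the counts match.
        ("DELETE", f"/{source}", None),
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(valid_shrink_targets(8))
    for step in shrink_plan("my-index", "my-index-shrunk", "data-node-1", 2):
        print(step[0], step[1])
```

Building the plan as data keeps the destructive last step visible and easy to review before anything is sent to the cluster.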

Spark Cheat Sheet for Scala/Python

Spark Example

Read the parquet file
scala> val param = spark.read.parquet("s3://file_path_you_put")

Print the parquet file schema
scala> param.printSchema()
root
 |-- sha1: string (nullable = true)
 |-- label: string (nullable = true)
 |-- time: long (nullable = true)

Print the parquet content
scala> new_result.show()
+--------------------+-----+----------+
|                uuid|label|      time|
+--------------------+-----+----------+
|d8f9ba869c19f25cc...| Hell|1562112000|
|f8e172cb34d620bbe...|     |1562112000|
|28eb0ec1e0d549a58...| PUMA|1562112000|
|145760249908bb4f7...| PUMA|1562112000|
|e5622270036303a86...| Hell|1562112000|
+--------------------+-----+----------+
only showing top 20 rows

Get the number of rows
scala>