/*
 * holds the size of maximum between msg size and cycle buffer,
 * aligned to cache line,
 * it is multiply by 2 for send and receive
 * with reference to number of flows and number of QPs
 */
ctx->buff_size = INC(BUFF_SIZE(ctx->size, ctx->cycle_buffer), ctx->cache_line_size) * 2 * num_of_qps_factor * user_param->flows;
65536 bytes = 64 KB
typically 16 pages (assuming a 4 KB page size)
root cause: ulimit -l (max locked memory) is 16 (default) in the container
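The buffer-size formula and the locked-memory limit can be checked together with a short sketch. It assumes, based only on the comment above, that BUFF_SIZE takes the larger of its two arguments and that INC rounds up to a multiple of the cache line; the function names here are hypothetical and not part of perftest.

```python
import resource


def align_up(size, alignment):
    """Round size up to the next multiple of alignment."""
    return ((size + alignment - 1) // alignment) * alignment


def buff_size(msg_size, cycle_buffer, cache_line_size,
              num_of_qps_factor=1, flows=1):
    # Assumption: BUFF_SIZE(a, b) behaves like max(a, b) and INC() aligns
    # the result to the cache line, as the source comment describes.
    base = max(msg_size, cycle_buffer)
    return align_up(base, cache_line_size) * 2 * num_of_qps_factor * flows


def fits_in_memlock(nbytes):
    """Return True if nbytes is within the soft RLIMIT_MEMLOCK limit."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return soft == resource.RLIM_INFINITY or nbytes <= soft


if __name__ == "__main__":
    needed = buff_size(msg_size=65536, cycle_buffer=4096, cache_line_size=64)
    print("buffer bytes:", needed, "fits in memlock:", fits_in_memlock(needed))
```

Running this inside the container should show whether the requested buffer exceeds the limit reported by ulimit -l.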
Python 3.6.12 |Anaconda, Inc.| (default, Sep 8 2020, 17:50:39)
[GCC Clang 10.0.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.8.0a0+f1a8a82'
TCPStore
python test/distributed/test_c10d.py
Python 3.6.12 |Anaconda, Inc.| (default, Sep 8 2020, 17:50:39)
[GCC Clang 10.0.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch.distributed as dist
>>> server_store = dist.TCPStore("127.0.0.1", 18668, 1, True)
--class=...
--master k8s://https://%s:%s
--deploy-mode cluster/client
--conf spark.kubernetes.namespace=default
--conf spark.app.name=spark-pi
SparkPi.jar
(MainApplicationFile: MainFile is the path to a bundled JAR, Python, or R file of the application.)
SPARK_HOME/bin/spark-submit args
The spark-on-k8s-operator controller runs the spark-submit script
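A rough sketch of how such an argument list could be assembled is shown here. It only mirrors the flags listed in these notes; the function name, its parameters, and the /opt/spark fallback are hypothetical and are not taken from the operator's code.

```python
import os
from typing import List


def build_spark_submit_command(master_host: str, master_port: int,
                               deploy_mode: str, namespace: str,
                               app_name: str, main_class: str,
                               main_file: str,
                               spark_home: str = "") -> List[str]:
    """Assemble a spark-submit command line from the flags in the notes."""
    # Assumption: fall back to /opt/spark when SPARK_HOME is not set.
    home = spark_home or os.environ.get("SPARK_HOME", "/opt/spark")
    return [
        os.path.join(home, "bin", "spark-submit"),
        "--class=" + main_class,
        "--master", "k8s://https://%s:%s" % (master_host, master_port),
        "--deploy-mode", deploy_mode,
        "--conf", "spark.kubernetes.namespace=" + namespace,
        "--conf", "spark.app.name=" + app_name,
        main_file,  # MainApplicationFile: bundled JAR, Python, or R file
    ]


if __name__ == "__main__":
    cmd = build_spark_submit_command(
        "10.0.0.1", 6443, "cluster", "default", "spark-pi",
        "org.example.SparkPi",  # hypothetical class name
        "SparkPi.jar")
    print(" ".join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting issues if it is later passed to subprocess.run.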
A candidate must contact a majority of the cluster in order to be elected, which means that every committed entry must be present in at least one of those servers
Raft determines which of two logs is more up-to-date by comparing the index and term of the last entries in the logs
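Simply put, when a candidate requests votes it must include its latest log information, namely the index and term of its last entry. The log whose last entry has the larger term is more up-to-date; if the terms are equal, the longer log is more up-to-date. This guarantees that a newly elected leader contains all previously committed entries.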
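The comparison rule above can be written as a small sketch. The type and function names are hypothetical; only the rule itself (compare the last term first, then the last index) comes from the text.

```python
from typing import NamedTuple


class LastEntry(NamedTuple):
    """Term and index of the last entry in a server's log."""
    term: int
    index: int


def at_least_as_up_to_date(candidate: LastEntry, voter: LastEntry) -> bool:
    """Return True if the candidate's log is at least as up-to-date
    as the voter's, which is the condition for granting a vote."""
    if candidate.term != voter.term:
        # Different last terms: the larger term is more up-to-date.
        return candidate.term > voter.term
    # Same last term: the longer log (larger last index) is more up-to-date.
    return candidate.index >= voter.index


if __name__ == "__main__":
    print(at_least_as_up_to_date(LastEntry(3, 1), LastEntry(2, 10)))
    print(at_least_as_up_to_date(LastEntry(2, 4), LastEntry(2, 5)))
```

A voter would call this check with its own last entry before granting a vote, so a candidate with a stale log cannot collect a majority.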
This property controls the maximum size that the pool is allowed to reach, including both idle and in-use connections. Basically this value will determine the maximum number of actual connections to the database backend.
Determines the maximum number of connections (idle and in-use)
minimumIdle = maximumPoolSize
The minimum number of idle connections to maintain
validationTimeout = 5s
This property controls the maximum amount of time that a connection will be tested for aliveness.