// peer is the representative of a remote raft node. Local raft node sends
// messages to the remote through peer.
// Each peer has two underlying mechanisms to send out a message: stream and
// pipeline.
// A stream is a receiver initialized long-polling connection, which
// is always open to transfer messages. Besides general stream, peer also has
// a optimized stream for sending msgApp since msgApp accounts for large part
// of all messages. Only raft leader uses the optimized stream to send msgApp
// to the remote follower node.
// A pipeline is a series of http clients that send http requests to the remote.
// It is only used when the stream has not been established.
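The stream retries dialing the remote peer every 100 ms. If the dial fails with the error request sent was ignored (cluster ID mismatch: remote[remote member id]=X-Etcd-Cluster-ID in http header, local=local cluster id), that log line is printed at a very high rate, so the mismatch needs to be fixed promptly.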
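To make the mismatch error concrete, here is a minimal sketch of the kind of check that produces it: the receiver compares the cluster ID carried in the X-Etcd-Cluster-ID header with its own. The helper names checkClusterID and headerWithClusterID, the sample IDs, and the exact error wording are assumptions for illustration, not etcd's actual code.

```go
package main

import (
	"fmt"
	"net/http"
)

// clusterIDHeader is the header name quoted in the error message above.
const clusterIDHeader = "X-Etcd-Cluster-ID"

// headerWithClusterID is a hypothetical helper that builds request headers
// carrying the sender's cluster ID.
func headerWithClusterID(id string) http.Header {
	h := http.Header{}
	h.Set(clusterIDHeader, id)
	return h
}

// checkClusterID is a hypothetical receiver-side check: it rejects a request
// whose cluster ID differs from the local one.
func checkClusterID(h http.Header, localID string) error {
	remoteID := h.Get(clusterIDHeader)
	if remoteID != localID {
		return fmt.Errorf("cluster ID mismatch: remote=%s, local=%s", remoteID, localID)
	}
	return nil
}

func main() {
	local := "cdf818194e3a8c32" // made-up local cluster ID

	// Same cluster: the request is accepted.
	fmt.Println(checkClusterID(headerWithClusterID(local), local))

	// Different cluster: the request is ignored and an error is reported.
	fmt.Println(checkClusterID(headerWithClusterID("deadbeef00000001"), local))
}
```

With a retry every 100 ms, a check like this fails about ten times per second for each mismatched peer, which is why the log fills up so quickly.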
The method above generates the Cluster ID and each Member ID from the two startup flags --initial-cluster-token and --initial-cluster.
NewClusterFromURLsMap calls NewMember to generate each Member ID.
First, look at the NewMember method:
func NewMember(name string, peerURLs types.URLs, clusterName string, now *time.Time) *Member
Core idea:
b []byte: peerUrls + clusterName + time
hash := sha1.Sum(b)
memberID: binary.BigEndian.Uint64(hash[:8])
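The Member ID is the first 8 bytes of the SHA-1 sum of peerUrls / clusterName / current_time, that is, a 64-bit value printed as a hex number of up to 16 digits.
Back in NewClusterFromURLsMap, the call to NewMember (shown below) passes nil as the last argument, so no time component is mixed in. The Member IDs generated by NewClusterFromURLsMap are therefore fixed.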
m := NewMember(name, urls, token, nil)
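The derivation can be sketched as a small standalone program. This is an illustration of the idea described above, not a copy of etcd's code: the helper name memberID, the sorting of the URLs, the encoding of the time as Unix seconds, and the sample URL and token are all assumptions.

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"sort"
	"time"
)

// memberID is a hypothetical helper: it hashes peer URLs + cluster name
// (+ optional time) with SHA-1 and keeps the first 8 bytes as a big-endian
// uint64.
func memberID(peerURLs []string, clusterName string, now *time.Time) uint64 {
	// Assumption: sort a copy of the URLs so the ID does not depend on
	// the order in which they were passed.
	urls := append([]string(nil), peerURLs...)
	sort.Strings(urls)

	var b []byte
	for _, u := range urls {
		b = append(b, u...)
	}
	b = append(b, clusterName...)
	if now != nil {
		// Assumption: the time is mixed in as decimal Unix seconds.
		b = append(b, fmt.Sprintf("%d", now.Unix())...)
	}

	hash := sha1.Sum(b)
	return binary.BigEndian.Uint64(hash[:8])
}

func main() {
	urls := []string{"http://10.0.0.1:2380"} // made-up peer URL

	// now == nil: the same inputs always yield the same ID.
	a := memberID(urls, "etcd-cluster", nil)
	b := memberID(urls, "etcd-cluster", nil)
	fmt.Printf("%016x %016x equal=%v\n", a, b, a == b)

	// now != nil: the ID depends on when the call was made.
	t1 := time.Unix(1700000000, 0)
	t2 := time.Unix(1700000001, 0)
	fmt.Printf("%016x\n%016x\n", memberID(urls, "", &t1), memberID(urls, "", &t2))
}
```

The first line of output shows two identical IDs; the last two lines differ because only the timestamp changed.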
Member ID generated by Member Add
Start directly from the server side: the MemberAdd method in etcdserver/api/v3rpc/member.go.
It contains the following code:
urls, err := types.NewURLs(r.PeerURLs)
if err != nil {
	return nil, rpctypes.ErrGRPCMemberBadURLs
}
now := time.Now()
m := membership.NewMember("", urls, "", &now)
if err = cs.server.AddMember(ctx, *m); err != nil {
	return nil, togRPCError(err)
}
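m := membership.NewMember("", urls, "", &now) mixes in the current time, so the resulting Member ID is not deterministic.
Summary
The Cluster ID is generated only once and never changes afterwards.
Member IDs generated from the etcd startup flags (initial-cluster) are fixed.
Member IDs generated by Member Add are not deterministic.
Member Add does not pass the member's name. After a successful member add, the new member therefore shows up in member list with an empty name and no client URL, because it has not yet published its client URL to the cluster.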
ReleaseLockTo releases the locks, which has smaller index than the given index except the largest one among them. For example, if WAL is holding lock 1,2,3,4,5,6, ReleaseLockTo(4) will release lock 1,2 but keep 3. ReleaseLockTo(5) will release 1,2,3 but keep 4.
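The rule can be sketched over a plain sorted slice of lock indexes. This is only a model of the behavior described above; the function name releaseLockTo and the slice representation are assumptions, and the real WAL operates on locked files.

```go
package main

import "fmt"

// releaseLockTo is a hypothetical model of the rule: given lock indexes
// sorted in ascending order, it releases every lock whose index is smaller
// than index, except the largest one among them.
func releaseLockTo(locks []uint64, index uint64) (released, kept []uint64) {
	// Count the locks whose index is strictly smaller than the given index.
	smaller := 0
	for smaller < len(locks) && locks[smaller] < index {
		smaller++
	}
	// With zero or one smaller lock there is nothing to release,
	// because the largest of them must be kept.
	if smaller <= 1 {
		return nil, locks
	}
	return locks[:smaller-1], locks[smaller-1:]
}

func main() {
	locks := []uint64{1, 2, 3, 4, 5, 6}

	released, kept := releaseLockTo(locks, 4)
	fmt.Println(released, kept) // prints "[1 2] [3 4 5 6]"

	released, kept = releaseLockTo(locks, 5)
	fmt.Println(released, kept) // prints "[1 2 3] [4 5 6]"
}
```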
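One rather exasperating thing last week was a joint debugging session with a team at another site: we hit all kinds of pitfalls and filled in quite a few along the way. Background: my team needed to integrate an SDK provided by their team into our product. What happened: it should have been a simple job, but the SDK was still under development on their side and nobody told us in advance about a version upgrade. I started with the old version, and the output was always wrong. I asked my manager to find their support contact; when I reached them, I was told they were in a meeting and would take a look when they had time. I waited the whole morning, and only after two in the afternoon did they say the version had been upgraded and that I should try version xx. With version xx the problem was finally solved. I thought that was the end of it, but it was only the start of the nightmare: during later verification I found a few lines of faulty code in their SDK that kept it from running correctly. I sent a screenshot of the code to their support contact, who insisted they had not added that code and said they would check again. I was speechless: the SDK came from them and the code was found inside it, so who added it? In any case the integration task is done, but further joint debugging with that team still looks like an uphill struggle.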