
docker security

https://docs.docker.com/engine/security/

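Docker security rests on two kernel features. The first is kernel namespaces: Docker creates process, network, and other namespaces for each container, so that containers on the same host have little effect on one another.

The second is control groups (cgroups), which limit the resources each container may use: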

ensure that each container gets its fair share of memory, CPU, disk I/O

Put simply, taking CPU as an example: cgroups keep one container from monopolizing the CPU (through misuse, malicious code, or an accidental bug) and starving the other containers.
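To make the CPU limit more concrete, here is a minimal sketch that reads the limit from inside a container. It assumes cgroup v2 mounted at /sys/fs/cgroup, where the cpu.max file holds a quota and a period in microseconds (or max when there is no limit). The helper name parseCPUMax is made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseCPUMax converts the content of a cgroup v2 cpu.max file
// ("<quota> <period>" in microseconds, or "max <period>") into a CPU count.
// It returns 0 and false when no limit is set or the content is malformed.
func parseCPUMax(content string) (float64, bool) {
	fields := strings.Fields(content)
	if len(fields) != 2 || fields[0] == "max" {
		return 0, false
	}
	quota, err1 := strconv.ParseFloat(fields[0], 64)
	period, err2 := strconv.ParseFloat(fields[1], 64)
	if err1 != nil || err2 != nil || period <= 0 {
		return 0, false
	}
	return quota / period, true
}

func main() {
	// Assumption: cgroup v2 is mounted at /sys/fs/cgroup inside the container.
	data, err := os.ReadFile("/sys/fs/cgroup/cpu.max")
	if err != nil {
		fmt.Println("cpu.max not readable:", err)
		return
	}
	if cpus, ok := parseCPUMax(string(data)); ok {
		fmt.Printf("CPU limit: %.2f CPUs\n", cpus)
	} else {
		fmt.Println("no CPU limit set")
	}
}
```

On a cgroup v2 host, a container started with a limit such as docker run --cpus=0.5 would typically see 50000 100000 in that file, which the helper turns into 0.5.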

container root user

https://docs.docker.com/engine/security/userns-remap/

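Running processes as root inside a container is discouraged, largely because UIDs and GIDs inside the container map onto the host. For example, unless user namespace remapping is enabled, a root process that escapes the container also has root privileges on the host.

Container escapes are serious security bugs that the Docker community fixes as a priority, but running as a non-root user still limits the damage if one occurs.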
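One way to see the mapping from inside a container is /proc/self/uid_map, where each line has the form "<id-inside> <id-outside> <length>". The sketch below is a rough check under that assumption (Linux only). The helper name rootMapsToHostRoot is hypothetical, and with nested namespaces the "outside" ID is relative to the parent namespace, so treat the result as a hint rather than proof.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// rootMapsToHostRoot reports whether UID 0 inside the namespace maps to UID 0
// outside it, given the content of /proc/self/uid_map.
func rootMapsToHostRoot(uidMap string) bool {
	for _, line := range strings.Split(uidMap, "\n") {
		var inside, outside, length uint64
		if _, err := fmt.Sscan(line, &inside, &outside, &length); err != nil {
			continue // skip empty or malformed lines
		}
		if inside == 0 && length > 0 {
			return outside == 0
		}
	}
	return false
}

func main() {
	data, err := os.ReadFile("/proc/self/uid_map")
	if err != nil {
		fmt.Println("uid_map not readable (not Linux?):", err)
		return
	}
	if os.Geteuid() == 0 && rootMapsToHostRoot(string(data)) {
		fmt.Println("warning: running as root that maps to root outside this namespace")
	} else {
		fmt.Println("not running as host-equivalent root")
	}
}
```

Without remapping the file usually contains a single identity mapping (0 0 4294967295). With userns-remap the second column is a high, unprivileged host UID instead.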

docker

https://docs.docker.com/engine/reference/run/#foreground

-t : Allocate a pseudo-tty
-i : Keep STDIN open even if not attached

docker run -t

docker pull ubuntu:bionic-20200713

docker run -t --rm ubuntu:bionic-20200713 /bin/bash
root@9a7a115ff8d2:/# ls

When the container is started without -i, commands such as ls produce no output, and exit does not leave the container terminal.

docker run -i

docker run -i --rm ubuntu:bionic-20200713 /bin/bash
echo hello
hello
exit

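When the container is started without -t, the usual terminal features are missing (for example, there is no prompt showing the current user), but commands such as ls print their output normally and exit leaves the container.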

https://stackoverflow.com/questions/48368411/what-is-docker-run-it-flag

Without -t tag one can still interact with the container, but with it, you’ll have a nicer, more features terminal.

k8s

https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#container-v1-core

// Variables for interactive containers, these have very specialized use-cases (e.g. debugging)
// and shouldn't be used for general purpose containers.

// Whether this container should allocate a buffer for stdin in the container runtime. If this
// is not set, reads from stdin in the container will always result in EOF.
// Default is false.
// +optional
Stdin bool `json:"stdin,omitempty" protobuf:"varint,16,opt,name=stdin"`
// Whether the container runtime should close the stdin channel after it has been opened by
// a single attach. When stdin is true the stdin stream will remain open across multiple attach
// sessions. If stdinOnce is set to true, stdin is opened on container start, is empty until the
// first client attaches to stdin, and then remains open and accepts data until the client disconnects,
// at which time stdin is closed and remains closed until the container is restarted. If this
// flag is false, a container processes that reads from stdin will never receive an EOF.
// Default is false
// +optional
StdinOnce bool `json:"stdinOnce,omitempty" protobuf:"varint,17,opt,name=stdinOnce"`
// Whether this container should allocate a TTY for itself, also requires 'stdin' to be true.
// Default is false.
// +optional
TTY bool `json:"tty,omitempty" protobuf:"varint,18,opt,name=tty"`

The kubelet copies these fields into the CRI container config when it creates the container:

config := &runtimeapi.ContainerConfig{
	Metadata: &runtimeapi.ContainerMetadata{
		Name:    container.Name,
		Attempt: restartCountUint32,
	},
	Image:       &runtimeapi.ImageSpec{Image: imageRef},
	Command:     command,
	Args:        args,
	WorkingDir:  container.WorkingDir,
	Labels:      newContainerLabels(container, pod),
	Annotations: newContainerAnnotations(container, pod, restartCount, opts),
	Devices:     makeDevices(opts),
	Mounts:      m.makeMounts(opts, container),
	LogPath:     containerLogsPath,
	Stdin:       container.Stdin,
	StdinOnce:   container.StdinOnce,
	Tty:         container.TTY,
}
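To show how these three flags appear in a container spec, here is a small sketch that serializes a trimmed-down stand-in type. The type interactiveContainer and the helper render are hypothetical and exist only for this example; only the JSON tags are taken from the Kubernetes source quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// interactiveContainer is a hypothetical, trimmed-down stand-in for the
// Kubernetes Container type; it is not the real API type.
type interactiveContainer struct {
	Name      string `json:"name"`
	Image     string `json:"image"`
	Stdin     bool   `json:"stdin,omitempty"`
	StdinOnce bool   `json:"stdinOnce,omitempty"`
	TTY       bool   `json:"tty,omitempty"`
}

// render returns the JSON form of the container spec fragment.
func render(c interactiveContainer) string {
	out, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	// Roughly the counterpart of `docker run -it`: keep stdin open and allocate a TTY.
	debug := interactiveContainer{Name: "debug", Image: "ubuntu:bionic-20200713", Stdin: true, TTY: true}
	fmt.Println(render(debug))

	// With all three flags left at false, omitempty drops them from the output.
	plain := interactiveContainer{Name: "app", Image: "ubuntu:bionic-20200713"}
	fmt.Println(render(plain))
}
```

Because tty requires stdin to be true, the two are set together in the first case.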

use conda env in docker

https://pythonspeed.com/articles/activate-conda-dockerfile/

https://docs.conda.io/projects/conda/en/latest/commands/run.html

conda run --no-capture-output -n my-python-env python --version

https://pkg.go.dev/database/sql/driver

The sql/driver package defines the interfaces a database driver should implement, and it spells out how ErrBadConn must be handled:

  1. The Connector.Connect and Driver.Open methods should never return ErrBadConn.
  2. ErrBadConn should only be returned from Validator, SessionResetter, or a query method if the connection is already in an invalid (e.g. closed) state.

var ErrBadConn = errors.New("driver: bad connection")

ErrBadConn should be returned by a driver to signal to the sql package that a driver.Conn is in a bad state (such as the server having earlier closed the connection) and the sql package should retry on a new connection.

To prevent duplicate operations, ErrBadConn should NOT be returned if there’s a possibility that the database server might have performed the operation. Even if the server sends back an error, you shouldn’t return ErrBadConn.

In short, when a driver returns ErrBadConn, the sql package retries the operation on a new connection.

https://pkg.go.dev/database/sql

golang native db connection pool

The connection retry mechanism is easiest to understand by reading the implementation of Query and Exec in database/sql.

https://github.com/golang/go/issues/11978

// maxBadConnRetries is the number of maximum retries if the driver returns
// driver.ErrBadConn to signal a broken connection before forcing a new
// connection to be opened.
const maxBadConnRetries = 2

// QueryContext executes a query that returns rows, typically a SELECT.
// The args are for any placeholder parameters in the query.
func (db *DB) QueryContext(ctx context.Context, query string, args ...interface{}) (*Rows, error) {
	var rows *Rows
	var err error
	for i := 0; i < maxBadConnRetries; i++ {
		rows, err = db.query(ctx, query, args, cachedOrNewConn)
		if err != driver.ErrBadConn {
			break
		}
	}
	if err == driver.ErrBadConn {
		return db.query(ctx, query, args, alwaysNewConn)
	}
	return rows, err
}

// Query executes a query that returns rows, typically a SELECT.
// The args are for any placeholder parameters in the query.
func (db *DB) Query(query string, args ...interface{}) (*Rows, error) {
	return db.QueryContext(context.Background(), query, args...)
}

Exec implementation

// ExecContext executes a query without returning any rows.
// The args are for any placeholder parameters in the query.
func (db *DB) ExecContext(ctx context.Context, query string, args ...interface{}) (Result, error) {
	var res Result
	var err error
	for i := 0; i < maxBadConnRetries; i++ {
		res, err = db.exec(ctx, query, args, cachedOrNewConn)
		if err != driver.ErrBadConn {
			break
		}
	}
	if err == driver.ErrBadConn {
		return db.exec(ctx, query, args, alwaysNewConn)
	}
	return res, err
}

// Exec executes a query without returning any rows.
// The args are for any placeholder parameters in the query.
func (db *DB) Exec(query string, args ...interface{}) (Result, error) {
	return db.ExecContext(context.Background(), query, args...)
}

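To sum up: on ErrBadConn the operation is attempted at most 2 times on a cached or new connection. If both attempts fail with ErrBadConn, it is tried one more time on a connection that is always new.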

psql BadConn

https://www.postgresql.org/docs/10/app-psql.html#id-1.9.4.18.7

psql exits with status 2 if the connection to the server went bad and the session was not interactive.

https://blog.golang.org/context#:~:text=A%20Context%20is%20safe%20for,to%20signal%20all%20of%20them.

A Context is safe for simultaneous use by multiple goroutines. Code can pass a single Context to any number of goroutines and cancel that Context to signal all of them.
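A minimal sketch of that property: several goroutines block on the same Context, and a single cancel call releases all of them. The function name runWorkers is made up for this example.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// runWorkers starts n goroutines that all block on the same Context and
// returns how many of them observed the cancellation.
func runWorkers(n int) int {
	ctx, cancel := context.WithCancel(context.Background())

	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		stopped int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-ctx.Done() // every goroutine waits on the shared Context
			mu.Lock()
			stopped++
			mu.Unlock()
		}()
	}

	cancel()  // one call signals all goroutines
	wg.Wait() // wait until each goroutine has returned
	return stopped
}

func main() {
	fmt.Println("workers stopped:", runWorkers(5))
}
```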

project structure

.
├── cmd
│   └── command.go
├── go.mod
├── go.sum
├── main.go
└── pkg
    └── run
        └── long_run_cli.go

3 directories, 5 files

main.go

package main

import (
	"context"
	"os"
	"os/signal"
	"syscall"

	"zs/toolkit-cli/cmd"
)

func main() {
	// Buffered so that a signal arriving before the goroutine is ready is not lost.
	c := make(chan os.Signal, 2)
	signal.Notify(c, syscall.SIGINT, syscall.SIGTERM)

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Cancel the context on the first SIGINT or SIGTERM.
	go func() {
		<-c
		cancel()
	}()

	cmd.Execute(ctx)
}

command.go

package cmd

import (
	"context"
	"fmt"
	"os"
	"os/exec"

	"github.com/spf13/cobra"

	"zs/toolkit-cli/pkg/run"
)

var rootCmd = &cobra.Command{
	Use: "long run cli",
	Run: func(cmd *cobra.Command, args []string) {
		cli := run.New()
		err := cli.LongRun(cmd.Context())

		if err != nil {
			fmt.Printf("cli run err: %v\n", err)
			if exitError, ok := err.(*exec.ExitError); ok {
				fmt.Printf("exit code: %d\n", exitError.ExitCode())
			}
		}
	},
}

func Execute(ctx context.Context) {
	if err := rootCmd.ExecuteContext(ctx); err != nil {
		fmt.Printf("err: %v\n", err)
		os.Exit(1)
	}
}

long_run_cli.go

package run

import (
	"context"
	"os/exec"
)

type CLI struct{}

func (cli CLI) LongRun(ctx context.Context) error {
	cmd := exec.CommandContext(ctx, "sleep", "30")
	return cmd.Run()
}

func New() *CLI {
	return &CLI{}
}

https://pkg.go.dev/os/exec#CommandContext

The provided context is used to kill the process (by calling os.Process.Kill) if the context becomes done before the command completes on its own.

https://github.com/golang/go/issues/21135

proposal: os/exec: allow user of CommandContext to specify the kill signal when context is done

commandContext will trigger SIGKILL when the ctx is done …

log output format sample

INFO[2021-07-04 15:26:26]main.go:28 have a nice day                               zs=log
INFO[2021-07-04 15:26:26]main.go:29 zs gogogo zs=log

code sample

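  • show the timestamp (FullTimestamp); without it, terminal output shows [0000], the number of seconds elapsed since the program started
  • add a common prefix: a field attached to every entry through WithField
  • add the file name and line number (SetReportCaller), at the cost of a little overhead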

package main

import (
	"path"
	"runtime"
	"strconv"

	"github.com/sirupsen/logrus"
)

func main() {
	var log = logrus.New()

	formatter := &logrus.TextFormatter{
		FullTimestamp:   true,
		TimestampFormat: "2006-01-02 15:04:05",
		CallerPrettyfier: func(f *runtime.Frame) (string, string) {
			_, filename := path.Split(f.File)
			// do not log func name
			return "", filename + ":" + strconv.Itoa(f.Line)
		},
	}
	log.SetFormatter(formatter)
	log.SetReportCaller(true)

	contextLogger := log.WithField("zs", "log")

	contextLogger.Info("have a nice day")
	contextLogger.Infof("%s gogogo", "zs")
}

third-party formatter

https://github.com/sirupsen/logrus#formatters

log output format sample

[2021-07-04 15:50:26]  INFO log: have a nice day
[2021-07-04 15:50:26] INFO log: zs gogogo

code sample

package main

import (
	"github.com/sirupsen/logrus"
	prefixed "github.com/x-cray/logrus-prefixed-formatter"
)

func main() {
	var log = logrus.New()

	formatter := &prefixed.TextFormatter{
		FullTimestamp:   true,
		TimestampFormat: "2006-01-02 15:04:05",
	}
	log.Formatter = formatter

	contextLogger := log.WithField("prefix", "log")

	contextLogger.Info("have a nice day")
	contextLogger.Infof("%s gogogo", "zs")
}

As the code above shows, setting the prefix field

contextLogger := log.WithField("prefix", "log")

prints the field value followed by a colon (log:) before each message.

https://developer.nvidia.com/gpudirect

Environment

  • Kernel: 3.10.0-514.44.5.10.h254.x86_64 (uname -r)
  • Nvidia Driver: 440.33.01 (nvidia-smi)
  • MLNX OFED: 4.3-1.0.1.0 (ofed_info)
  • Mellanox/nv_peer_memory: Tag 1.1-0

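When the NVIDIA driver is installed from a container, lsmod | grep nvidia finds the module, yet modinfo nvidia fails with a module-not-found error.

The build scripts in the nv_peer_memory repository have to be modified to work around this.

Building nv_peer_memory manually

Prepare an empty directory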

mkdir -p /root/nv_peer_memory
cd /root/nv_peer_memory

NVIDIA Driver

https://us.download.nvidia.com/tesla/440.33.01/NVIDIA-Linux-x86_64-440.33.01.run

# Download `NVIDIA-Linux-x86_64-440.33.01.run`
curl -o NVIDIA-Linux-x86_64-440.33.01.run 'https://us.download.nvidia.com/tesla/440.33.01/NVIDIA-Linux-x86_64-440.33.01.run'

# Extract into the current directory (run through sh because the downloaded file is not executable)
sh NVIDIA-Linux-x86_64-440.33.01.run -x

nv_peer_memory

https://github.com/Mellanox/nv_peer_memory/tree/1.1-0

https://www.mellanox.com/products/GPUDirect-RDMA

# -L follows the redirect that GitHub issues for archive downloads
curl -L -o nv_peer_memory-1.1-0.tar.gz 'https://github.com/Mellanox/nv_peer_memory/archive/1.1-0.tar.gz'
tar xzf nv_peer_memory-1.1-0.tar.gz

Manual build

cd nv_peer_memory-1.1-0

In the Makefile, set nv_sources to the location of the NVIDIA driver sources:

nv_sources=/root/nv_peer_memory/NVIDIA-Linux-x86_64-440.33.01/kernel

In create_nv.symvers.sh, set nvidia_mod to the location of the NVIDIA driver .ko installed on the host, for example:

nvidia_mod=/var/k8s/nvidia/drivers/nvidia.ko

Build

See the nv_peer_memory README.md.

./build_module.sh

rpmbuild --rebuild /tmp/nvidia_peer_memory-1.1-0.src.rpm

Install the rpm

rpm -ivh /root/rpmbuild/RPMS/x86_64/nvidia_peer_memory-1.1-0.x86_64.rpm

Test

lsmod | grep nv_peer_mem

Run the job with NCCL_DEBUG=INFO and check the log for GDRDMA, for example:

NCCL version 2.4.8+cuda10.1

NCCL INFO Ring 00 : 3 -> 10 [send] via NET/IB/0/GDRDMA

Trick

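  • create_nv.symvers.sh in the nvidia_peer_memory source can be run on its own. Because modinfo nvidia fails with a module-not-found error when the NVIDIA driver was installed from a container, find a machine where the NVIDIA driver is installed directly on the host and run bash -x create_nv.symvers.sh there to trace the execution and the values of the relevant variables.

  • The following command shows the .ko file that backs a module:

$ /sbin/modinfo -F filename -k 3.10.0-514.44.5.10.h142.x86_64 nvidia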
/lib/modules/3.10.0-514.44.5.10.h142.x86_64/kernel/drivers/video/nvidia.ko

https://community.mellanox.com/s/article/in-between-ethernet-vlans-and-infiniband-pkeys

https://community.mellanox.com/s/article/howto-use-infiniband-pkey-membership-types-in-virtualization-environment--connectx-3--connectx-3-pro-x

https://community.mellanox.com/s/article/howto-configure-ipoib-networks-with-gateway-and-multiple-pkeys

https://community.mellanox.com/s/article/HowTo-Configure-SR-IOV-for-ConnectX-4-ConnectX-5-ConnectX-6-with-KVM-Ethernet

https://github.com/Mellanox/k8s-rdma-sriov-dev-plugin

https://github.com/mellanox/k8s-rdma-shared-dev-plugin

https://docs.openshift.com/container-platform/4.6/networking/hardware_networks/add-pod.html#add-pod

IOV: I/O Virtualization

Single Root I/O Virtualization (SR-IOV) network

https://docs.openshift.com/container-platform/4.6/networking/hardware_networks/about-sriov.html

https://github.com/k8snetworkplumbingwg/sriov-cni

https://docs.mellanox.com/display/MLNXOFEDv461000/Kubernetes%20Using%20SR-IOV

https://community.mellanox.com/s/article/kubernetes-ipoib-sriov-networking-with-connectx4-connectx5

graph LR
InitContainer --> TrainingContainer
InitContainer --> SidecarContainer

InitContainer and SidecarContainer act like system containers and are transparent to the TrainingContainer.

The user's TrainingJob (a process) runs in the TrainingContainer.

Environment initialization, such as downloading data, can be done in the InitContainer, and uploads can be done in the SidecarContainer.

However, this raises an engineering problem: file read permissions. The best approach is to run the InitContainer, SidecarContainer, and TrainingContainer as the same user (UID).

powered by mermaid

https://mermaid-js.github.io/mermaid/#/flowchart

https://theme-next.js.org/docs/tag-plugins/mermaid.html?highlight=mermaid

https://github.com/theme-next/hexo-theme-next/pull/649

Type size

https://golang.org/ref/spec#Size_and_alignment_guarantees

https://github.com/ardanlabs/gotraining-studyguide/blob/master/go/language/struct.go

type example struct {
	flag    bool
	counter int16
	pi      float32
}

Alignment factor (compare #pragma pack(n) in C)

  • member alignment
  • struct alignment

Alignment rules

  1. For a variable x of any type: unsafe.Alignof(x) is at least 1.
  2. For a variable x of struct type: unsafe.Alignof(x) is the largest of all the values unsafe.Alignof(x.f) for each field f of x, but at least 1.
  3. For a variable x of array type: unsafe.Alignof(x) is the same as the alignment of a variable of the array’s element type.

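layout (field offsets in bytes)

  • bool(0)
  • int16(2)
  • float32(4)

8 bytes in total: 1 byte for flag, 1 byte of padding so that counter starts on a 2-byte boundary, 2 bytes for counter, and 4 bytes for pi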

https://eddycjy.gitbook.io/golang/di-1-ke-za-tan/go-memory-align

//TODO list

  1. https://medium.com/analytics-vidhya/lenet-with-tensorflow-a35da0d503df
  2. https://medium.com/@mgazar/lenet-5-in-9-lines-of-code-using-keras-ac99294c8086

https://www.tensorflow.org/api_docs/python/tf/pad

paddings is an integer tensor with shape [n, 2], where n is the rank of tensor.

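For each dimension D:

  • paddings[D, 0]: how many values to add before the contents of tensor in that dimension
  • paddings[D, 1]: how many values to add after the contents of tensor in that dimension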
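The semantics of paddings can be illustrated without TensorFlow. The sketch below is a plain Go re-implementation of constant zero padding for a rank-2 tensor, not the tf.pad API. The function name pad2D is made up for this example.

```go
package main

import "fmt"

// pad2D zero-pads a rectangular matrix. paddings[d][0] is the number of
// zeros added before dimension d and paddings[d][1] the number added after.
func pad2D(m [][]int, paddings [2][2]int) [][]int {
	rows := len(m)
	cols := 0
	if rows > 0 {
		cols = len(m[0])
	}

	outRows := paddings[0][0] + rows + paddings[0][1]
	outCols := paddings[1][0] + cols + paddings[1][1]

	out := make([][]int, outRows)
	for i := range out {
		out[i] = make([]int, outCols) // zero-initialized
	}
	for i := 0; i < rows; i++ {
		// Shift each input row down by paddings[0][0] and right by paddings[1][0].
		copy(out[i+paddings[0][0]][paddings[1][0]:], m[i])
	}
	return out
}

func main() {
	t := [][]int{{1, 2, 3}, {4, 5, 6}}
	for _, row := range pad2D(t, [2][2]int{{1, 1}, {2, 2}}) {
		fmt.Println(row)
	}
}
```

A 2x3 input padded with [[1, 1], [2, 2]] becomes 4x7: one row of zeros above and below, and two columns of zeros on each side.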

https://www.tensorflow.org/api_docs/python/tf/expand_dims