Kubernetes Peer Discovery
Note: this is a follow-up to my last post. This post is about porting the algorithm to Kubernetes.
In my last post, I described an algorithm to discover all instances of a service in a Docker Compose environment. While I use Docker Compose a lot for small projects and demos, I use Kubernetes for bigger projects, so I tried to port the algorithm to Kubernetes. In theory, this only requires changing the code that gets the IPs of all peers.
Kubernetes Deployment
To start, I created a deployment with a pod running the code. By scaling it to 3 replicas, multiple pods are available for connections. A Service is not needed, as the pods connect directly to each other and no connections from outside or from other deployments/pods are necessary.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubernetes-peers
  labels:
    app: kubernetes-peers
spec:
  selector:
    matchLabels:
      app: kubernetes-peers
  replicas: 3
  template:
    metadata:
      labels:
        app: kubernetes-peers
    spec:
      serviceAccountName: kubernetes-peers-sa
      containers:
        - name: kubernetes-peers
          image: kubernetes-peers-image
          ports:
            - containerPort: 5000
As can be seen in the listing above, only a basic deployment is needed. The two important settings are the app label for selection and the containerPort that allows incoming connections from the other peers.
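For context, here is a minimal sketch of what might listen behind that containerPort. The actual listener comes from the previous post, so the use of tokio and the exact setup shown here are assumptions:
use tokio::net::TcpListener;

pub async fn listen_for_peers() -> std::io::Result<()> {
    // bind to 0.0.0.0 so the pod accepts connections on its pod IP,
    // on the same port that is declared as containerPort above
    let listener = TcpListener::bind("0.0.0.0:5000").await?;
    loop {
        let (stream, addr) = listener.accept().await?;
        tracing::info!("Accepted peer connection from {}", addr);
        // hand the stream over to the peer protocol from the previous post
        drop(stream);
    }
}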
Permissions
To get the IPs of the other peers, access to the pod data is needed. For this, a Role can be created that allows access to the needed endpoint:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubernetes-peers-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
When paired with a ServiceAccount and a RoleBinding, this grants the pods of the deployment access to the needed pod endpoints of the Kubernetes API.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubernetes-peers-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubernetes-peers-rb
subjects:
  - kind: ServiceAccount
    name: kubernetes-peers-sa
    namespace: default # ServiceAccount subjects need a namespace; adjust if not deploying to the default namespace
roleRef:
  kind: Role
  name: kubernetes-peers-role
  apiGroup: rbac.authorization.k8s.io
By setting the serviceAccountName in the Deployment above, we bind the ServiceAccount with the granted permissions to the pods of the deployment.
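To verify that the binding is effective from inside a pod, a small, hypothetical check like the following can be used. Client::try_default() picks up the mounted ServiceAccount token, so a permission error on this call points to a missing Role or RoleBinding; the function name and log messages are illustrative:
use k8s_openapi::api::core::v1::Pod;
use kube::{api::ListParams, Api, Client};

pub async fn check_pod_access() -> Result<(), kube::Error> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);
    // limit(1) keeps the check cheap; a 403 here means the Role or
    // RoleBinding is missing or misconfigured
    let list = pods.list(&ListParams::default().limit(1)).await?;
    tracing::info!("RBAC check OK, saw {} pod(s)", list.items.len());
    Ok(())
}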
Pods and Labels
My first idea was to use the information from the Deployment to identify the pods, ideally without setting an environment variable or any other external reference. As the hostname of a pod is the name of the pod, this name can be used to retrieve the necessary information. My plan was to derive the name of the Deployment from it, but only the name of the owning ReplicaSet is included in this data. That would still be sufficient, but the ReplicaSet in turn doesn't list the pods it controls. So another way to get the pods is needed, as neither the Deployment nor the ReplicaSet works here.
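For illustration, this is roughly what the abandoned idea looked like: reading the ownerReferences of the own pod only yields the ReplicaSet, not the Deployment. The function name and error handling are just a sketch:
use k8s_openapi::api::core::v1::Pod;
use kube::{Api, Client};

pub async fn print_owner(pod_name: &str) -> Result<(), kube::Error> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);
    let pod = pods.get_metadata(pod_name).await?;
    for owner in pod.metadata.owner_references.unwrap_or_default() {
        // for a Deployment-managed pod this logs "ReplicaSet kubernetes-peers-<hash>",
        // with no direct reference to the Deployment itself
        tracing::info!("Owner: {} {}", owner.kind, owner.name);
    }
    Ok(())
}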
As the app label was set earlier, it can be used to identify the needed pods. By applying this label as a filter, the requested pods can be listed. Only when scaling or restarting the Deployment is there a chance that too many pods are listed, as additional ones are being created or destroyed. But during this phase the pods aren't stable anyway, so this is acceptable for this use case.
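As a sketch of that filter, the lookup could also be pushed to the API server with a label selector, rather than filtering the returned list on the client side as the implementation below does. The selector string is derived from the app label set in the Deployment above:
use k8s_openapi::api::core::v1::Pod;
use kube::{api::ListParams, Api, Client};

pub async fn list_peer_pods() -> Result<Vec<Pod>, kube::Error> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);
    // the API server only returns pods carrying app=kubernetes-peers
    let params = ListParams::default().labels("app=kubernetes-peers");
    Ok(pods.list(&params).await?.items)
}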
Querying the IPs
As the rest of the application is written in Rust, the kube crate from kube.rs can be used to request the needed information from the Kubernetes API. For this, an instance of the Api struct is created as the base of operation. After that, the value of the app label is extracted by querying the metadata of the pod's own object. This value is then used to identify the pods that belong to the deployment and are therefore the other peers. Lastly, duplicate IPs are filtered out to prevent redundant connections:
use std::net::IpAddr;

use k8s_openapi::api::core::v1::Pod;
use kube::{Api, Client};

pub async fn get_peer_ips() -> Vec<IpAddr> {
    tracing::info!("Getting peer IPs");
    let pod_name = std::env::var("HOSTNAME").expect("HOSTNAME should be set");
    let client = Client::try_default().await.unwrap();
    let pods: Api<Pod> = Api::default_namespaced(client.clone());
    // find the app label of the own pod
    let own_pod = pods.get_metadata(&pod_name).await.unwrap();
    let app_label = own_pod.metadata.labels.unwrap().get("app").unwrap().to_string();
    // get all pods with the same app label; pods without labels are skipped
    let pods = pods.list(&Default::default()).await.unwrap();
    let pods = pods.items.iter().filter(|pod| {
        pod.metadata
            .labels
            .as_ref()
            .and_then(|labels| labels.get("app"))
            == Some(&app_label)
    });
    let mut ips: Vec<IpAddr> = vec![];
    for pod in pods {
        tracing::warn!("Pod: {:?}", pod.metadata.name);
        let pod_status = match pod.status.as_ref() {
            Some(status) => status,
            None => {
                tracing::error!("Pod status is None for pod: {:?}", pod.metadata.name);
                continue;
            }
        };
        let pod_ip = match pod_status.pod_ip.as_ref() {
            Some(ip) => ip,
            None => {
                tracing::error!("Pod IP is None for pod: {:?}", pod.metadata.name);
                continue;
            }
        };
        match pod_ip.parse() {
            Ok(ip) => ips.push(ip),
            Err(e) => {
                tracing::error!("Failed to parse Pod IP for pod {:?}: {}", pod.metadata.name, e);
                continue;
            }
        };
    }
    // filter duplicate IPs
    ips.sort_unstable();
    ips.dedup();
    tracing::warn!("Pod IPs: {:?}", ips);
    ips
}
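To show how the rest of the algorithm stays infrastructure-agnostic, here is a hypothetical caller that only consumes the returned addresses. The port matches the containerPort from the Deployment; the actual connection logic from the previous post is only hinted at in a comment:
pub async fn connect_to_peers() {
    for ip in get_peer_ips().await {
        tracing::info!("Dialing peer at {}:5000", ip);
        // e.g. tokio::net::TcpStream::connect((ip, 5000)).await
    }
}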
As shown by this port, the algorithm itself is independent of the infrastructure, as long as the IPs of the other peers can be requested. If you want the full code and want to experiment with the implementation yourself, you can check out the repository on GitLab.