
Conversation

overstep123 commented Aug 25, 2021

  1. Fix nameservice to get the correct CPU and memory resources from cgroups.
  2. Fix the KEYS URL in the nameservice Dockerfile.
  3. Add pod anti-affinity for brokers for better high availability. (For example, if two master brokers land on the same node and that node goes down, there is a window during which RocketMQ cannot handle producing.)

Contributor

caigy left a comment


Also, link the PR with the related issue and rename the title to start with '[ISSUE xx]'.

 curl https://archive.apache.org/dist/rocketmq/${ROCKETMQ_VERSION}/rocketmq-all-${ROCKETMQ_VERSION}-bin-release.zip -o rocketmq.zip; \
 curl https://archive.apache.org/dist/rocketmq/${ROCKETMQ_VERSION}/rocketmq-all-${ROCKETMQ_VERSION}-bin-release.zip.asc -o rocketmq.zip.asc; \
-curl https://www.apache.org/dist/rocketmq/KEYS -o KEYS; \
+curl https://downloads.apache.org/rocketmq/KEYS -o KEYS; \
Contributor

Could you provide the reason for changing it? It seems that this modification is not related to the title.

Author

This is the result of the old URL:
[image: screenshot attached in the original comment]

Contributor

OK, this is indeed a problem. I think you can use curl -L to solve the redirect problem, the same as in the rocketmq-docker repo.
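The reviewer's suggestion could look like this as a Dockerfile fragment (a sketch, not the PR's final code; the -f flag is an extra assumption on top of the suggested -L, so an HTTP error fails the build instead of writing an error page into KEYS):

```shell
# -L follows the redirect that www.apache.org/dist now returns;
# -f makes curl exit non-zero on an HTTP error instead of saving the error body
curl -fL https://www.apache.org/dist/rocketmq/KEYS -o KEYS
```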

Comment on lines 385 to 401
podAntiAffinity = &corev1.PodAntiAffinity{
	RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{{
		TopologyKey: "kubernetes.io/hostname",
		LabelSelector: &metav1.LabelSelector{
			MatchLabels: ls,
		},
	}},
	PreferredDuringSchedulingIgnoredDuringExecution: []corev1.WeightedPodAffinityTerm{{
		Weight: 100,
		PodAffinityTerm: corev1.PodAffinityTerm{
			TopologyKey: "kubernetes.io/hostname",
			LabelSelector: &metav1.LabelSelector{
				MatchLabels: labelsForBroker(broker.Name),
			},
		},
	}},
}
Contributor

Changes should be related to this PR. Also, the podAntiAffinity feature is discussed in #38 with PR #75; you can join in and work together to implement this feature.

Author

I have no idea why nodeAffinity is needed, but podAntiAffinity definitely is, because I don't want the whole cluster (possibly with multiple master brokers) to become unwritable when all the master brokers are assigned to the same node and that node goes down. This situation gets worse if Kubernetes can't start a new master pod in time.

Contributor

You'd better open another PR to do it, because this is obviously unrelated to this issue.

Comment on lines 57 to 58
system_memory_in_mb=$(($(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)/1024/1024))
system_cpu_cores=$(($(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)/100000))
Contributor

It would be better to check that the memory and CPU values read from cgroup are smaller than the system memory and CPU, because resource limits may not be set.
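The guard the reviewer suggests could be sketched like this (variable names and the literal values are illustrative, not the PR's actual code; the real script would read them from /sys/fs/cgroup and from the host):

```shell
# Pick the smaller of the cgroup-reported and host-reported values.
# When no limit is set, cgroup v1 reports a huge number (close to 2^63),
# so comparing against the host value filters that case out.
min_of() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

host_mem_mb=16384            # in the real script: e.g. free -m | awk '/Mem:/ {print $2}'
cgroup_mem_mb=8796093022207  # unset limit: memory.limit_in_bytes / 1024 / 1024 is huge
system_memory_in_mb=$(min_of "$cgroup_mem_mb" "$host_mem_mb")
echo "$system_memory_in_mb"
```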

Author

This is better, and I fixed it. Although I think it's still not correct if anyone uses it in Kubernetes without a memory limit.

system_memory_in_mb=$system_memory_in_mb_in_docker
fi
system_cpu_cores=`egrep -c 'processor([[:space:]]+):.*' /proc/cpuinfo`
system_cpu_cores_in_docker=$(($(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)/100000))
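Combining the quota/period pair the reviewer asks about could be sketched like this (sample values are hard-coded for illustration; the real script reads both files from /sys/fs/cgroup/cpu, and a quota of -1 means no limit is set):

```shell
cfs_quota_us=200000   # e.g. from: cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us
cfs_period_us=100000  # e.g. from: cat /sys/fs/cgroup/cpu/cpu.cfs_period_us

if [ "$cfs_quota_us" -gt 0 ]; then
    # A limit is set: available cores = quota / period.
    system_cpu_cores_in_docker=$(( cfs_quota_us / cfs_period_us ))
else
    # Quota is -1 (unlimited): fall back to the host's core count.
    system_cpu_cores_in_docker=$(grep -c '^processor' /proc/cpuinfo)
fi
echo "$system_cpu_cores_in_docker"
```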
Contributor

Is 100000 read from /sys/fs/cgroup/cpu/cpu.cfs_period_us ?

Author

overstep123 commented Sep 15, 2021

That's right, and it's fixed.

Contributor

caigy commented Sep 17, 2021

@overstep123 Please link the PR to the related issue and rename the title to start with '[ISSUE xx]'.

overstep123 changed the title from "fix cpu and mem in docker for Linux" to "[Issue 84]fix cpu and mem in docker for Linux" on Sep 17, 2021.
duhenglucky merged commit 704ac40 into apache:master on Mar 25, 2022.