Consider mkdir -p within docker-server.sh #645

<h3> Overview </h3> 
When using the Liberty image, we've run into a rare scenario that causes `CrashLoopBackOff` behaviour within our workloads and requires the affected pod to be killed and re-created.

This looks to be related to the use of [`mkdir` without the `-p` flag in the Liberty image's `docker-server.sh` script](https://github.com/WASdev/ci.docker/blob/d0bafe089f3d6b712968e25021ad3b26b64ad3bd/ga/latest/kernel/helpers/runtime/docker-server.sh#L64), which affects pod/container restart scenarios.

<h3> Container configuration </h3>  

- `emptyDir` volume, mounted at `/tmp`

<h3> Scenario </h3> 
Our K8s node temporarily became unavailable. During this time, our Liberty-based container was restarted twice.

The container logs here were: 
```
2024-10-24 02:38:30.574 | Found mounted TLS certificates, generating keystore
2024-10-24 02:38:44.892 | Found mounted TLS certificates, generating keystore
2024-10-24 02:38:44.952 | mkdir: cannot create directory ‘/tmp/certs’: File exists
2024-10-24 02:39:39.972 | Found mounted TLS certificates, generating keystore
2024-10-24 02:39:40.233 | mkdir: cannot create directory ‘/tmp/certs’: File exists
```

On the first restart, the `/tmp/certs` directory would have been created. However, due to complications with node unavailability, it seems that the [following line to clean up the `/tmp/certs` directory was never executed](https://github.com/WASdev/ci.docker/blob/d0bafe089f3d6b712968e25021ad3b26b64ad3bd/ga/latest/kernel/helpers/runtime/docker-server.sh#L73).

Upon the second restart, the `/tmp/certs` directory would already exist, due to our use of an `emptyDir` volume mounted at `/tmp`. The pod containing the workload remained on the same node and was not moved to another node, so the `emptyDir` was not cleared between executions.

[From the K8s docs](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir), on `emptyDir`: 

> When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.

This led to `CrashLoopBackOff` behaviour until the pod was manually killed, and a new pod was created.
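
The failure mode can be reproduced outside the container. The following is a minimal sketch, not taken from `docker-server.sh`: it uses a scratch directory (an assumption, standing in for the `emptyDir` mounted at `/tmp`) and the illustrative variable names `scratch` and `certs_dir`. It shows that a second plain `mkdir` on a surviving directory returns a non-zero status, while `mkdir -p` does not.

```shell
#!/bin/sh
# Scratch directory standing in for the emptyDir mounted at /tmp (assumption for this demo).
scratch=$(mktemp -d)
certs_dir="$scratch/certs"

# First container start: the directory does not exist yet, so plain mkdir succeeds.
mkdir "$certs_dir"
first_status=$?

# Restart within the same pod: the cleanup step never ran, so the directory is still present.
mkdir "$certs_dir" 2>/dev/null
second_status=$?

# With -p, an already existing directory is not treated as an error.
mkdir -p "$certs_dir"
with_p_status=$?

echo "first=$first_status second=$second_status with_p=$with_p_status"

# Remove the scratch directory used by the demo.
rm -rf "$scratch"
```

If the startup script treats a non-zero status from `mkdir` as fatal, every restart that reuses the same `emptyDir` would fail in the same way, which matches the repeated `File exists` errors in the logs above.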

<h3> Suggestion </h3>

To prevent the above scenario from occurring in workloads with similar configurations, would it make sense to update `docker-server.sh` to replace `mkdir /tmp/certs` with `mkdir -p /tmp/certs`?

It looks as though this may be a bug, as [`mkdir -p` is already used in other areas of the `docker-server.sh` script](https://github.com/WASdev/ci.docker/blob/d0bafe089f3d6b712968e25021ad3b26b64ad3bd/ga/latest/kernel/helpers/runtime/docker-server.sh#L32).
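
As a rough illustration of the idea, here is a sketch of a restart-tolerant directory setup. The helper name `prepare_certs_dir` is hypothetical and is not code from `docker-server.sh`. Clearing leftovers before re-creating the directory is an extra assumption beyond the `-p` change proposed here, included only because an interrupted run may have left partial files behind.

```shell
#!/bin/sh
# Hypothetical helper, not code from docker-server.sh.
prepare_certs_dir() {
  dir="${1:?directory path required}"
  # Drop leftovers from a run that was interrupted before its cleanup step (assumption).
  rm -rf "$dir"
  # -p: no error if the directory already exists; parent directories are created as needed.
  mkdir -p "$dir"
}

# Demo in a scratch location rather than the real /tmp/certs.
scratch=$(mktemp -d)
prepare_certs_dir "$scratch/certs"; first_status=$?
touch "$scratch/certs/stale.pem"   # simulate a file left behind by a crashed run
prepare_certs_dir "$scratch/certs"; second_status=$?

if [ -e "$scratch/certs/stale.pem" ]; then stale_left=yes; else stale_left=no; fi
echo "first=$first_status second=$second_status stale_left=$stale_left"

# Remove the scratch directory used by the demo.
rm -rf "$scratch"
```

Both calls are expected to succeed, so a container restarted in the same pod would no longer fail on the directory creation step. The one-flag change to `mkdir -p /tmp/certs` alone would already avoid the `File exists` error.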

