Meet the New Container Networking Stack for Cloud Foundry

A few months ago, we shared a vision for container networking for Cloud Foundry. Today, we are excited to introduce you to netman-release – the new, pluggable container networking stack for Cloud Foundry.

As stated in the vision, the main problems that the container networking effort aims to solve are:

- Security policies within Cloud Foundry are provided through Application Security Groups (ASGs), which require an application restart to apply policy. Simple, CIDR-based rules are too broad to indicate application intent.
- All communication between containers must go through the GoRouter. This exposes internal applications by requiring them either to have a public route or to configure ASGs that allow all internal communication. Neither of these is a particularly desirable solution.
- Application identity is an important concern for security administrators. Without direct addressing for containers, an external firewall has to trust all packets coming from Cloud Foundry instead of applying app-specific policies.
- Third-party networking stacks, including SDN stacks, cannot easily be plugged into the current architecture.

How are we solving these challenges?

With netman-release 0.6.0, there are three key sets of capabilities that help address the current challenges with container networking on Cloud Foundry. These capabilities are enabled through a “batteries included” pluggable network stack using Flannel, based on the CNI specification.

Direct addressing for containers

- All containers are connected to a single, system-wide routed (L3) IP network, backed by a VXLAN overlay network
- All containers have a single network interface and a single IP address on the overlay network
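
To make the addressing model concrete, here is a minimal Python sketch of how a single overlay network might be carved into per-cell subnets, with each container getting exactly one IP. The overlay range `10.255.0.0/16`, the `/24` per-cell subnet size, and the function names are assumptions chosen for illustration; they are not taken from netman-release or its Flannel configuration.

```python
import ipaddress

# Assumed values for illustration only; not taken from netman-release.
OVERLAY_NETWORK = ipaddress.ip_network("10.255.0.0/16")
CELL_SUBNET_PREFIX = 24


def cell_subnet(cell_index):
    """Return the slice of the overlay network leased to one Diego cell."""
    subnets = list(OVERLAY_NETWORK.subnets(new_prefix=CELL_SUBNET_PREFIX))
    return subnets[cell_index]


def container_ip(cell_index, container_index):
    """Return the single overlay IP for the Nth container on a cell.

    The first host address (.1) is treated as the cell's gateway here,
    which is an assumption of this sketch, so containers start at .2.
    """
    hosts = list(cell_subnet(cell_index).hosts())
    return hosts[container_index + 1]


if __name__ == "__main__":
    print(cell_subnet(3))       # 10.255.3.0/24
    print(container_ip(3, 0))   # 10.255.3.2
```

Because every container address falls inside the one overlay range, any container can in principle route to any other by IP; whether that traffic is allowed is a separate policy question.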

Granular policy enforcement and application identity

- Communication between containers is enabled with granular application-level policies that are applied dynamically, without requiring app restarts. Policy configuration is as simple as:
```
cf allow-access SOURCE_APP DEST_APP --protocol tcp|udp --port [1-65535]
```

- Policy configuration is currently restricted to the CF admin user or OAuth clients with the network.admin scope
- Policy configuration can be done using simple CLI commands (currently requires a CLI plugin) or through an external API
- Policy enforcement uses VXLAN Group Policy Option with application identity encoded as a 16-bit field in the VXLAN header
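
The sketch below shows one way a 16-bit application identity could ride in a VXLAN header and be checked against a policy on the receiving side. The byte layout follows the VXLAN Group Policy Option draft as we understand it (a "G" flag, a 16-bit Group Policy ID, and a 24-bit VNI). The `ALLOWED_POLICIES` table, the function names, and the way IDs map to apps are hypothetical; this is not how netman-release itself implements enforcement.

```python
import struct

VXLAN_FLAG_GBP = 0x80  # "G" bit: a Group Policy ID is present
VXLAN_FLAG_VNI = 0x08  # "I" bit: the VNI field is valid


def encode_vxlan_gbp_header(vni, group_id):
    """Build an 8-byte VXLAN header carrying a 16-bit group (app) ID."""
    if not 0 <= group_id <= 0xFFFF:
        raise ValueError("group ID must fit in 16 bits")
    if not 0 <= vni <= 0xFFFFFF:
        raise ValueError("VNI must fit in 24 bits")
    flags = VXLAN_FLAG_GBP | VXLAN_FLAG_VNI
    # Byte 0: flags; byte 1: remaining flag bits (left at zero here);
    # bytes 2-3: Group Policy ID; bytes 4-6: VNI; byte 7: reserved.
    return struct.pack("!BBHI", flags, 0, group_id, vni << 8)


def decode_vxlan_gbp_header(header):
    """Return (vni, group_id) from the first 8 bytes of a VXLAN header."""
    flags, _, group_id, vni_and_reserved = struct.unpack("!BBHI", header[:8])
    if not flags & VXLAN_FLAG_GBP:
        raise ValueError("header does not carry a Group Policy ID")
    return vni_and_reserved >> 8, group_id


# Hypothetical policy table: (source app ID, protocol, destination port).
ALLOWED_POLICIES = {(0x0001, "tcp", 8080)}


def is_allowed(header, protocol, port):
    """Check the sender's app ID from the header against the policy table."""
    _, group_id = decode_vxlan_gbp_header(header)
    return (group_id, protocol, port) in ALLOWED_POLICIES


if __name__ == "__main__":
    header = encode_vxlan_gbp_header(vni=1, group_id=0x0001)
    print(header.hex())                       # 8800000100000100
    print(is_allowed(header, "tcp", 8080))    # True
    print(is_allowed(header, "udp", 8080))    # False
```

Carrying the identity in the packet header means the receiving side can decide based on which app sent the traffic rather than on a source IP range, which is the gap CIDR-based rules leave open.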

Support for third-party plugins

- Third-party CNI plugins can replace the included batteries to enable deeper integrations with other networking stacks
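
For plugin authors new to CNI, the following Python sketch outlines the general calling convention: the runtime invokes a plugin executable with `CNI_*` environment variables, passes the network configuration as JSON on stdin, and reads a JSON result from stdout. The result shape is loosely modeled on the 0.3.x version of the CNI specification, the address is a hard-coded placeholder, and the function names are hypothetical. Check the spec version you target and the netman-release plugin documentation before relying on any of it.

```python
import json
import os
import sys

REQUIRED_ENV = ("CNI_COMMAND", "CNI_CONTAINERID", "CNI_NETNS", "CNI_IFNAME")

# Placeholder address; a real plugin would allocate one and configure
# the interface inside the container's network namespace.
PLACEHOLDER_ADDRESS = "10.255.3.2/24"


def handle_cni_call(env, stdin_text):
    """Handle one CNI invocation and return the result as a dict."""
    missing = [name for name in REQUIRED_ENV if name not in env]
    if missing:
        raise ValueError("missing CNI environment variables: " + ", ".join(missing))
    config = json.loads(stdin_text)
    command = env["CNI_COMMAND"]
    if command == "ADD":
        return {
            "cniVersion": config.get("cniVersion", "0.3.1"),
            "ips": [{"version": "4", "address": PLACEHOLDER_ADDRESS}],
        }
    if command == "DEL":
        # Nothing to report after tearing down the container's networking.
        return {}
    raise ValueError("unsupported CNI_COMMAND: " + command)


def main():
    """Entry point a real plugin executable would call."""
    result = handle_cni_call(dict(os.environ), sys.stdin.read())
    json.dump(result, sys.stdout)
```

Keeping the handler separate from `main()` makes the ADD and DEL paths easy to exercise in tests without a container runtime.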

Wait, is the whole network stack changing under me?

While allowing policy-driven communication between containers is an important first step, we have not revamped the entire networking stack. We do plan to transition some of the existing components over time and would love to hear your feedback on the approach.

- The CF Router continues to reach app containers via NATed ports on Diego cells. For the batteries-included approach, we plan to provide an option to move the router onto the overlay in the future so that policy can be applied to external communication.
- Application Security Groups (ASGs) will continue to work as before. Eventually, we envision policy configuration replacing ASGs by providing the same or better capabilities.

This is cool, how do I play with it?

The GitHub page for netman-release has all the information you need to get started with the new container networking features. Netman-release is a Garden-runC add-on that can be optionally included during Diego manifest generation.

The easiest way to get started is to follow the instructions to Deploy to BOSH-lite and then try out the Cats & Dogs example.

We have additional examples for service discovery on our GitHub page. Also check out a real-world integration running Akka on Cloud Foundry.

We also have instructions for 3rd Party Plugin Development. Plugin authors – we want to make our platform more pluggable, and your input can help us!

This is really cool, what’s next?

We believe that providing direct addressability to containers with policy enforcement is the first step to enabling a rich set of features. We have a few ideas on where to go from here, and again would appreciate your feedback on these.

- Built-in service discovery: Today, an external service discovery mechanism must be deployed in order for application instances to discover each other. While we have provided examples using Eureka and Amalgam8, do we need to do more?
- Connecting to external services: We hear from many operators that the lack of application identity on traffic leaving Cloud Foundry is a big concern for security admins. Is providing a static NATed IP address enough? Should the IP range be configurable? Do we need an IP address per application?
- Policy configuration by space developers: We would like to allow space developers to configure policies for their applications. Do we need a new role, or is the space developer role enough?

Conclusion

We believe that the new container networking features will enable new microservice and clustering use cases and provide an elegant solution to commonly raised questions around securing communications in Cloud Foundry. We invite you to “kick the tires” and give us your valuable feedback, either through comments here, GitHub issues and feature requests, or the Cloud Foundry #container-networking Slack channel.