Extending Network Reachability of Vertex AI Pipelines
Vertex AI Pipelines and jobs use the Service Networking API to configure their networking. As such, they run in a Google-owned VPC network in a tenant project that is peered with the network in your project. This peering is subject to the following standard VPC Network Peering constraints:
- The Google-owned network has its own default route and will not import your network’s default route to the Internet.
- Peered networks can be configured to export static and dynamic routes to each other. However, transitive peering is not allowed. Thus, Vertex AI jobs in the tenant project will not, by default, be able to reach other networks peered to your network.
Given these constraints, it initially appears that you cannot, for instance, apply firewall rules to your public endpoints that allow access only from your Vertex AI jobs. Additionally, these jobs cannot immediately route through your network to reach other services that are peered with your network.
This article suggests configurations that allow a Vertex AI Pipelines job, which runs in a Google-managed VPC, to make the following connections:
- Access endpoints on the Internet using public source IPs that you control
- Access other peered endpoints by forwarding connections through your network
If you are a data scientist, you can use this article as a reference to drive conversations about your networking requirements with your organization’s network administrator.
Controlling the public outbound source IP
Some use cases call for allowing the pipeline job to connect to endpoints over the Internet. In order to configure firewall rules that allow these connections, an administrator needs to know the public source IP used by the pipeline.
Since the pipeline job runs in a Google-owned project that you have no access to, any outbound connections from there directly to the Internet are beyond your control.
While we recommend using IAM authentication to protect these endpoints, some users also need to deploy firewall rules that protect against Denial-of-Service attacks and comply with their organization’s security requirements.
The following workflows offer a way to control the outbound source IP for these connections.
Access Through NAT Instance
This configuration allows the pipeline to access endpoints on the Internet using the source IP of a NAT instance that you control.
- Create a NAT instance with one interface in your peered network and another in an outbound network. You may select any image that meets your requirements; however, we will use a standard Debian instance to drive this discussion. NOTE: Since User Network1 refers to the NAT instance as the next hop to 220.127.116.11, the NAT instance should forward traffic for 18.104.22.168 through its other interface in User Network2 in order to prevent a routing loop in User Network1.
- Ensure that IP forwarding is enabled on this instance and that it has a public IP.
- Go to the VPC Routes page for your project and add a static route to the target with the instance as the next hop. The example below uses 22.214.171.124 as the target.
- Also visit the peering configuration and enable `Export custom routes` to ensure that this static route is exported to the tenant project. Here we see that the static route to 126.96.36.199 has been exported to service networking.
- Connect to the instance and configure it to translate the source IPs on forwarded traffic to the address of its own outbound interface. In the example below, interface ens4 on the Debian instance is in the peered network and ens5 is in the outbound network.
sudo sysctl net.ipv4.conf.all.forwarding=1
sudo iptables --table nat --append POSTROUTING --out-interface ens5 -j MASQUERADE
- Also configure the instance with a static route to the endpoint that uses the outbound interface and the gateway in User Network2 as the next hop:
sudo ip route add 188.8.131.52/32 via 10.1.1.1 dev ens5
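The console steps above can also be sketched with the gcloud CLI. This is a minimal sketch under stated assumptions, not a definitive procedure: the instance, network, and subnet names, the zone and region, and the 203.0.113.10 target are all hypothetical placeholders, and the peering name is the one Service Networking usually creates. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Sketch only: every name, zone, region, and address below is a hypothetical
# placeholder. Commands are printed, not executed, unless DRY_RUN=0.
NAT_VM="${NAT_VM:-nat-instance}"
ZONE="${ZONE:-us-central1-a}"
REGION="${REGION:-us-central1}"
PEERED_NET="${PEERED_NET:-user-network1}"
PEERED_SUBNET="${PEERED_SUBNET:-user-subnet1}"
OUTBOUND_NET="${OUTBOUND_NET:-user-network2}"
OUTBOUND_SUBNET="${OUTBOUND_SUBNET:-user-subnet2}"
TARGET_IP="${TARGET_IP:-203.0.113.10}"   # documentation-only address
# Usual name of the Service Networking peering; confirm it with
# `gcloud compute networks peerings list`.
PEERING="${PEERING:-servicenetworking-googleapis-com}"

# Print a command, and run it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

setup_nat_path() {
  # First NIC (ens4) in the peered network without an external IP; second NIC
  # (ens5) in the outbound network, which gets an ephemeral external IP unless
  # a reserved one is passed with address=...
  run gcloud compute instances create "$NAT_VM" \
    --zone="$ZONE" --image-family=debian-12 --image-project=debian-cloud \
    --can-ip-forward \
    --network-interface="network=$PEERED_NET,subnet=$PEERED_SUBNET,no-address" \
    --network-interface="network=$OUTBOUND_NET,subnet=$OUTBOUND_SUBNET"

  # Static route to the endpoint with the NAT instance as the next hop.
  run gcloud compute routes create to-endpoint-via-nat \
    --network="$PEERED_NET" --destination-range="$TARGET_IP/32" \
    --next-hop-instance="$NAT_VM" --next-hop-instance-zone="$ZONE"

  # Export custom routes so the tenant project learns the static route.
  run gcloud compute networks peerings update "$PEERING" \
    --network="$PEERED_NET" --export-custom-routes

  # List the routes exported over the peering to confirm the result.
  run gcloud compute networks peerings list-routes "$PEERING" \
    --network="$PEERED_NET" --region="$REGION" --direction=OUTGOING
}

setup_nat_path
```

Printing first makes it easy to review the commands with your network administrator before anything changes in the project.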
The following packet capture from the instance confirms that outbound connections from the pipeline’s 10.100.1.2 address are translated to the instance’s 10.1.1.2 address.
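A capture like the one described here could be taken on the NAT instance roughly as follows. This is a sketch under assumptions: 203.0.113.10 is a hypothetical stand-in for your endpoint, and the packet count of 20 is arbitrary. The commands are only printed unless DRY_RUN=0 is set.

```shell
# Sketch only: TARGET_IP is a hypothetical placeholder for your endpoint.
TARGET_IP="${TARGET_IP:-203.0.113.10}"

# Print a command, and run it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

inspect_nat() {
  # Show the MASQUERADE rule together with its packet counters.
  run sudo iptables --table nat --list POSTROUTING --numeric --verbose

  # Capture on all interfaces: the same flow should show up once with the
  # pipeline's source address (inbound on ens4) and once with the
  # instance's own address (outbound on ens5).
  run sudo tcpdump -n -i any -c 20 host "$TARGET_IP"
}

inspect_nat
```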
Access Through Proxy
If you need to control the source IP that Vertex AI uses when connecting to a particular service, you may also configure a service proxy in your network.
The Vertex AI job can reach the proxy in the directly peered network. Requests from the pipeline to external endpoints are relayed through the proxy and appear to originate from the proxy’s public IP address.
Similarly, connections from the Vertex AI job will appear to the public endpoint to originate from the proxy’s public IP.
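From inside the pipeline job, a request through such a proxy might look like the sketch below. It assumes an HTTP forward proxy listening on a hypothetical internal address, 10.1.0.5:3128, in the peered network and uses a placeholder endpoint URL; neither value comes from this article. The command is only printed unless DRY_RUN=0 is set.

```shell
# Sketch only: the proxy address and the endpoint URL are hypothetical.
PROXY_URL="${PROXY_URL:-http://10.1.0.5:3128}"
ENDPOINT_URL="${ENDPOINT_URL:-https://example.com/}"

# Print a command, and run it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

proxy_request() {
  # Relay the request through the proxy; the endpoint sees the proxy's
  # public IP as the source of the connection.
  run curl --silent --show-error --max-time 10 \
    --proxy "$PROXY_URL" "$ENDPOINT_URL"
}

proxy_request
```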
Access to other Peered Services
On occasion, the pipeline needs to access another service that consumes service networking.
We will use Memorystore for Redis to drive this discussion. Recall that transitive peering is not supported; in the example below, connections from the Vertex AI network cannot, by default, be forwarded through your network to the Redis network.
Collocation of Peered Services
The preferred option is to deploy all services that need to communicate with each other in the same Service Networking reservation. In the example below the Vertex AI job can reach Redis directly.
Please refer to the Reserving IP Ranges for Vertex AI discussion for guidance on extending this range to accommodate other services. Many services offer the option to either reserve a new range or consume an existing range. Please refer to the respective service documentation for guidance on how to use existing ranges.
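For the Redis example, consuming an existing range could be sketched with gcloud as follows. The network, region, instance name, and allocated range name are hypothetical placeholders, and the flags should be checked against the current Memorystore documentation before use. The commands are only printed unless DRY_RUN=0 is set.

```shell
# Sketch only: all names and the region below are hypothetical placeholders.
PEERED_NET="${PEERED_NET:-user-network1}"
REGION="${REGION:-us-central1}"
REDIS_NAME="${REDIS_NAME:-pipeline-cache}"
RANGE_NAME="${RANGE_NAME:-vertex-ai-range}"   # name of the allocated range

# Print a command, and run it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

colocate_redis() {
  # List the ranges already allocated for private services access.
  run gcloud compute addresses list --global --filter="purpose=VPC_PEERING"

  # Show which allocated ranges the Service Networking connection uses.
  run gcloud services vpc-peerings list --network="$PEERED_NET"

  # Create the Redis instance inside the existing allocated range rather
  # than reserving a new one.
  run gcloud redis instances create "$REDIS_NAME" \
    --region="$REGION" --network="$PEERED_NET" \
    --connect-mode=private-service-access \
    --reserved-ip-range="$RANGE_NAME"
}

colocate_redis
```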
The workflows discussed above for assigning Vertex AI Pipelines public IPs by going through either a NAT instance or a proxy are easily adapted for access to this Redis example.
In the diagram below, User Network1 has a static route to Redis with the NAT instance as a next hop and exports this to the Vertex network. Connections from Vertex AI appear to Redis as if they originate from the NAT instance’s interface in User Network2.
This configuration might reduce the number of routes to manage in case you add new Service Networking IP ranges.
You can also replace the NAT instance with a routing instance between User Network1 and User Network2.
With this configuration, you need a static route to the Redis IP range in User Network1 as well as one to the Vertex AI IP range in User Network2. Both the static routes point to the routing instance as the next hop.
This configuration might be easier to troubleshoot than using a NAT instance. However, to maintain bidirectional connectivity, you will have to work with your network administrator to update the routing tables in both networks whenever you bring up new services.
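The two static routes for the routing-instance option could be sketched with gcloud as follows. The instance name, zone, network names, both IP ranges, and both peering names are hypothetical placeholders; in particular, the peering on the Redis side may have a different name depending on the instance's connect mode. The commands are only printed unless DRY_RUN=0 is set.

```shell
# Sketch only: all names, the zone, and both ranges are hypothetical.
ROUTER_VM="${ROUTER_VM:-routing-instance}"
ZONE="${ZONE:-us-central1-a}"
NET1="${NET1:-user-network1}"                 # peered with Vertex AI
NET2="${NET2:-user-network2}"                 # peered with Redis
VERTEX_RANGE="${VERTEX_RANGE:-10.100.0.0/16}"
REDIS_RANGE="${REDIS_RANGE:-10.200.0.0/29}"
# Peering names are assumptions; list them with
# `gcloud compute networks peerings list`.
NET1_PEERING="${NET1_PEERING:-servicenetworking-googleapis-com}"
NET2_PEERING="${NET2_PEERING:-servicenetworking-googleapis-com}"

# Print a command, and run it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

setup_routed_path() {
  # User Network1: reach the Redis range through the routing instance.
  run gcloud compute routes create to-redis-via-router \
    --network="$NET1" --destination-range="$REDIS_RANGE" \
    --next-hop-instance="$ROUTER_VM" --next-hop-instance-zone="$ZONE"

  # User Network2: reach the Vertex AI range through the same instance.
  run gcloud compute routes create to-vertex-via-router \
    --network="$NET2" --destination-range="$VERTEX_RANGE" \
    --next-hop-instance="$ROUTER_VM" --next-hop-instance-zone="$ZONE"

  # Export each static route to the service network peered on that side.
  run gcloud compute networks peerings update "$NET1_PEERING" \
    --network="$NET1" --export-custom-routes
  run gcloud compute networks peerings update "$NET2_PEERING" \
    --network="$NET2" --export-custom-routes
}

setup_routed_path
```

On the routing instance itself, IP forwarding still has to be enabled and the instance needs its own routes toward each range, but no MASQUERADE rule is required because both sides can route back to each other.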
Rather than a routing instance, you can also configure Cloud VPN between User Network1 and User Network2. Here again you will need to export static routes both ways so that the respective service networks know to route through your network to reach each other. While you will still have to maintain the static route, this option relieves you of having to manage a routing instance yourself.
Similarly, if Vertex goes through a proxy to access Redis, those connections appear to originate from the proxy’s IP. The Redis instance can respond without having to know how to reach the Vertex AI network.
Vertex AI is one of a number of services that use the Service Networking API to connect the Google-owned tenant project to your consumer project. Deployment guides for these services discuss how to connect these services to endpoints in your project.
This document extends the networking configuration discussion and presents options for connecting Vertex AI to other Service Networking consumers or endpoints on the Internet.