Insights and Observations from re:Invent 2024 – Day 1

Monday, 2nd December 2024

Today began with a walk around the different venues to familiarise myself with where my sessions will be held. When people say everything is far apart, they are not kidding. The scale is something that must be experienced.

My first session of re:Invent covered building secure network designs for generative AI applications.

SEC327-R | Building secure network designs for generative AI applications

Starting with a hand-drawn architecture diagram, we explored the implementation in stages, covering deployment and connectivity between services within a VPC. A defence-in-depth strategy requires more than just networking, and we uncovered some of the controls involved: Service Control Policies (SCPs), Identity and Access Management (IAM), network ACLs, security groups, NAT gateways, Network Firewall, DNS Firewall and AWS WAF (Web Application Firewall). We also touched on key management for controlling and enforcing encryption, and on using policies to filter which models can be used.
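To make the idea of filtering models with policies a little more concrete, here is a minimal sketch of my own rather than something shown in the session. It assumes the models are Amazon Bedrock foundation models and that the filter is an IAM policy that denies invocation of anything outside an approved list. The function name and the model ID are hypothetical placeholders.

```python
import json


def build_model_filter_policy(region, allowed_model_ids):
    """Build an IAM policy document that denies Bedrock model invocation
    for every resource except the approved foundation models."""
    # Foundation model ARNs have an empty account field (note the "::").
    allowed_arns = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in allowed_model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnapprovedModels",
                "Effect": "Deny",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                # Deny + NotResource: everything NOT in the list is blocked.
                "NotResource": allowed_arns,
            }
        ],
    }


# Hypothetical model ID, used purely for illustration.
policy = build_model_filter_policy(
    "us-east-1", ["example-provider.example-model-v1"]
)
print(json.dumps(policy, indent=2))
```

A deny statement like this only removes access. The caller still needs a separate allow for the approved models, and any other invocable resource that is not in the list would be blocked as well.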

Key takeaways

  1. Use a defense-in-depth strategy for your generative AI workloads
  2. Protect your AWS infrastructure, applications, and data
  3. Leverage private networking with AWS PrivateLink
  4. Encrypt your data both in transit and at rest
  5. Prevent data exfiltration with multiple layers of firewalling and identity controls
  6. Avoid a situation in which a failure of one protection affects the security posture of the entire generative AI stack by using multiple layers of defense
  7. Employ DDoS protections at the edge to achieve your resiliency objectives
  8. Use AWS WAF to protect web front ends that are using generative AI foundation models
  9. Apply geo blocking to locations that are not expected to access your applications
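The geo blocking takeaway can be sketched as an AWS WAF rule. This is my own illustration, not material from the session: it builds the rule as a plain Python dictionary in the shape AWS WAF uses for rule JSON, blocking any request whose country is not in an allowed set. The rule name, metric name and country codes are assumptions chosen for the example.

```python
import json


def build_geo_block_rule(allowed_countries, priority=0):
    """Build an AWS WAF rule that blocks requests from any country
    that is NOT in the allowed set."""
    return {
        "Name": "BlockUnexpectedCountries",  # hypothetical rule name
        "Priority": priority,
        "Statement": {
            # NotStatement inverts the geo match: match everything
            # outside the allowed countries.
            "NotStatement": {
                "Statement": {
                    "GeoMatchStatement": {
                        "CountryCodes": sorted(allowed_countries)
                    }
                }
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "BlockUnexpectedCountries",
        },
    }


# Illustrative country codes only; use the locations you actually expect.
rule = build_geo_block_rule({"AU", "NZ"})
print(json.dumps(rule, indent=2))
```

Expressing the rule as an allow list wrapped in a NotStatement keeps the list short when an application only serves a handful of countries. The resulting rule would still need to be added to a web ACL that is attached to the front end.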

Played Games!

Due to the sheer distance between venues, I was not able to make my midday session on time. Instead, I met some strangers, visited the Sports Forum again and played the AWS Builders Card Game with the Generative AI expansion deck. The card game takes you on a journey from on-premises to the cloud and teaches concepts related to the AWS Well-Architected Framework and the relationships between foundational services.

SEC402-R | Game Day - Winning the DDoS Game

In the afternoon, I attended a Game Day on DDoS defence for a fictitious company and found the challenge insightful. My team, made up of random people, did not win, but we certainly learned along the way and had fun competing. I am going to brush up on some skills with this workshop:

AWS WAF Workshop

AWS re:Invent Expo – Officially Open

To round out the day, I caught up with Mark Badenach for the opening of the Expo at the Venetian. The crowds were massive and it was quite overwhelming: sensory overload. After such a big few days I have hit the wall and need to rest, so it is an early night for me tonight. I must be getting old!

Recent Announcements

Lastly, it would not be re:Invent without some announcements:

New Amazon EC2 P5en instances with NVIDIA H200 Tensor Core GPUs and EFAv3 networking

Amazon EC2 P5en instances deliver up to 3,200 Gbps network bandwidth with EFAv3 for accelerating deep learning, generative AI, and HPC workloads with unmatched efficiency.

VPC Lattice now includes TCP support with VPC Resources

With the launch of VPC Resources for Amazon VPC Lattice, you can now access all of your application dependencies through a VPC Lattice service network. You are able to connect to your application dependencies hosted in different VPCs, accounts, and on-premises using additional protocols, including TLS, HTTP, HTTPS, and now TCP. This new feature expands upon the existing HTTP-based services support, enabling you to share a wider range of resources across your organisation.

Amazon EC2 P5en instances, optimized for generative AI and HPC, are generally available

Today, AWS announces the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P5en instances, powered by the latest NVIDIA H200 Tensor Core GPUs. These instances deliver the highest performance in Amazon EC2 for deep learning and high-performance computing (HPC) applications. You can use Amazon EC2 P5en instances for training and deploying increasingly complex large language models (LLMs) and diffusion models powering the most demanding generative AI applications.
