
AWS re:Invent 2022 Keynote by Swami Sivasubramanian

Here I have captured the key highlights from the roughly two-hour AWS re:Invent 2022 keynote by Swami Sivasubramanian, AWS VP of Data & ML.

Newly announced AWS services are prefixed with "New".

Keynote by Swami Sivasubramanian, AWS VP Data & ML

  • AWS AI/ML stack: [Image: AWS AI/ML stack]
  • Building an end-to-end Data Strategy: [Image: AWS Building end-to-end Data Strategy]
  • New Amazon Athena for Apache Spark: Get started with interactive analytics on Apache Spark in under a second
  • New Redshift integration for Apache Spark: Generally available (GA); see the keynote by Adam Selipsky
  • Apache Spark runs 3x faster on AWS than open-source Spark: Customers can run Apache Spark on EMR, Glue, SageMaker, Redshift & Athena
  • New Amazon DocumentDB Elastic Clusters: Fully managed solution to scale document workloads of virtually any size & scale
  • Amazon SageMaker:
    • AWS has removed the heavy lifting associated with ML, so that it is accessible to many more developers
    • Build, Train & Deploy ML models for any use case with fully managed infrastructure, tools & workflows
    • End-to-end ML Journey: Build -> Train -> Deploy
  • Amazon SageMaker now supports Geospatial ML:
    • Making it easier to build, train & deploy ML models using Geospatial data
    • Acquire geospatial data with just a few clicks
    • Easily prepare geospatial data with built-in algos
    • Speed model building with neural network models
  • Amazon Redshift Multi-AZ: A new Multi-AZ deployment option, similar to Multi-AZ in RDS
    • Delivering HA & Reliability to support mission-critical Analytics workloads
  • New Trusted Language Extensions for PostgreSQL:
    • PostgreSQL has hundreds of extensions available. Customers asked for these as managed PostgreSQL extensions on RDS/Aurora, but they carry the security risk of superuser access. To address this, AWS introduced Trusted Language Extensions for PostgreSQL:
      • A new open-source project to support PostgreSQL extensions on Amazon RDS & Aurora
      • Safely use extensions to meet your needs
      • Install extensions without waiting for AWS certification
      • Leverage popular programming languages
      • What is a PostgreSQL extension?
        • PostgreSQL is designed to be easily extensible, and PostgreSQL extensions add extra functionality to your database by modifying and enhancing how it performs certain processes.
        • Moreover, Postgres extensions can help with some of the limitations you may find with vanilla Postgres (such as working efficiently with time-series data) without the hassle of switching to a whole new database.
  • New Amazon GuardDuty RDS Protection:
    • Protect your data in Aurora with Intelligent Threat Detection
    • Leverages ML to accurately detect suspicious activity
    • Delivers security findings enriched with contextual data
    • Continuously monitors for potential threats with just one click
  • AWS Glue Data Quality: Automatically measure, monitor & manage data quality in your data lake
  • Centralized Access Controls for Redshift Data Sharing: Govern access to Redshift using AWS Lake Formation
  • ML governance challenges:
    • Creating custom policies is time consuming
    • Capturing & sharing model information can lead to errors
    • Gaining visibility into model performance is expensive
    • To address these challenges:
      • Amazon SageMaker ML Governance: Governance & Auditability for end-to-end ML development:
        • Role Manager: Define custom user permissions in minutes
        • Model Cards: Centralize model information & documentation
        • Model Dashboard: Monitor model performance in one place
  • New Amazon DataZone:
    • A data management service to catalog, discover, share, & govern data across the organization:
      • Integrates with AWS Lake Formation, Amazon Athena, Amazon Redshift data sharing, and APIs to third-party sources
      • Amazon DataZone is a portal
    • Unlock data across organizational boundaries with built-in governance:
      • Data Producers: Teams that want to share data
      • Data Consumers: Teams that want to use data
      • Amazon DataZone provides a unified environment, or zone, where everyone in the organization, from data producers to data consumers, can share & access data in a governed manner
      • [Images: Amazon DataZone, slides 1/4 to 4/4]
  • Amazon Aurora Zero-ETL integration with Amazon Redshift
  • Amazon Redshift auto-copy from S3: Simplify & automate file ingestion into Redshift:
    • Easily create & maintain simple data ingestion pipelines
    • Continuously ingest data as soon as new files are created in S3
    • Automate data loading without engineering resources
  • Amazon AppFlow: Move data between SaaS services and data lakes & data warehouses
    • Amazon AppFlow now offers 50+ connectors
  • Amazon SageMaker Data Wrangler: Import data from SaaS services & third-party sources for ML
    • Similarly, access 40+ new data sources from Amazon SageMaker Data Wrangler
  • Education -> Program Update:
    • AWS ML University now provides educator training (i.e. Train the Trainer): An AI & ML educator training program for community colleges & MSIs in the US
    • AWS AI & ML Scholarship Program, announced last year (2021), for underserved & underrepresented students
    • 150+ courses from AWS on learning Data & AI/ML
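
As a rough illustration of the Redshift auto-copy item above ("continuously ingest data as soon as new files are created in S3"), here is a small in-memory Python sketch of the idea. It is not the Redshift or S3 API: the `AutoCopySketch` class, its `poll` method, and the dict standing in for a bucket are all hypothetical names, and loading each file exactly once is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AutoCopySketch:
    """Toy stand-in for an auto-copy job (hypothetical, not an AWS API)."""
    loaded_keys: set = field(default_factory=set)  # files already ingested
    table: list = field(default_factory=list)      # stand-in for the target table

    def poll(self, bucket: dict) -> list:
        """Ingest any keys not seen before; return the keys loaded this round."""
        new_keys = sorted(k for k in bucket if k not in self.loaded_keys)
        for key in new_keys:
            self.table.extend(bucket[key])  # "copy" the rows of the new file
            self.loaded_keys.add(key)       # remember it so it is not reloaded
        return new_keys


# Hypothetical bucket contents: object key -> rows in that file.
bucket = {"sales/2022-11-01.csv": [("a", 1), ("b", 2)]}
job = AutoCopySketch()

first = job.poll(bucket)                     # loads the one existing file
bucket["sales/2022-11-02.csv"] = [("c", 3)]  # a new file arrives
second = job.poll(bucket)                    # loads only the new file
third = job.poll(bucket)                     # nothing new, nothing loaded
```

The point of the sketch is only that the pipeline reacts to newly created files without anyone re-running a load by hand; the real service handles this inside Redshift.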


Note: Copyright for the images used in this blog belongs to AWS / Amazon.

This post is licensed under CC BY 4.0 by the author.
