Here I have captured the key highlights from the ~2-hour AWS re:Invent 2022 keynote by Swami Sivasubramanian, AWS VP of Data & ML.
Newly announced AWS services are prefixed with "New".
Keynote by Swami Sivasubramanian, AWS VP Data & ML
- AWS AI/ML stack
- Building an end-to-end data strategy
- New Amazon Athena for Apache Spark: Get started with interactive analytics on Apache Spark in under a second
- New Redshift integration for Apache Spark: generally available (GA); see the keynote by Adam Selipsky
- According to AWS, Apache Spark runs 3x faster on AWS than open-source Spark: customers can run Apache Spark on EMR, Glue, SageMaker, Redshift & Athena
- New Amazon DocumentDB Elastic Clusters: Fully managed solution to scale document workloads of virtually any size & scale
- Amazon SageMaker:
- AWS has removed the heavy lifting associated with ML, so that it's accessible to many more developers
- Build, Train & Deploy ML models for any use case with fully managed infrastructure, tools & workflows
- End-to-end ML Journey: Build -> Train -> Deploy
- Amazon SageMaker now supports Geospatial ML:
- Making it easier to build, train & deploy ML models using Geospatial data
- Acquire geospatial data with just a few clicks
- Easily prepare geospatial data with built-in algos
- Speed model building with neural network models
- Amazon Redshift Multi-AZ: a new Multi-AZ option, similar to the one in RDS
- Delivers high availability (HA) & reliability for mission-critical analytics workloads
- New Trusted Language Extensions for PostgreSQL:
- PostgreSQL has hundreds of extensions* available. Customers asked for these as managed extensions on RDS/Aurora, but many extensions require superuser access, which is a security risk. To address this, AWS introduced Trusted Language Extensions for PostgreSQL:
- A new open-source project to support PostgreSQL extensions on Amazon RDS & Aurora
- Safely use extensions to meet your needs
- Install extensions without waiting for AWS certification
- Leverage popular programming languages
- *What is a PostgreSQL extension?
- PostgreSQL is designed to be easily extensible. Extensions add functionality to your database by modifying or enhancing how it performs certain operations.
- Moreover, Postgres extensions can help with some of the limitations you may find with vanilla Postgres (such as working efficiently with time-series data) – without the hassle of switching to a whole new database.
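To make the Trusted Language Extensions workflow a bit more concrete, here is a minimal Python sketch that only assembles the SQL you might run; it does not connect to a database. It assumes the open-source `pg_tle` extension is already enabled on the instance (`CREATE EXTENSION pg_tle;`) and relies on its `pgtle.install_extension` function. The helper `build_tle_install_sql`, the dollar-quote tag and the `add_one_ext` example extension are hypothetical names chosen for illustration.

```python
import re

# Arbitrary dollar-quote tag wrapping the extension body, so the inner
# function body can still use plain $$ quoting.
_TAG = "$_pgtle_$"


def _quote(text: str) -> str:
    """Quote a value as a SQL string literal (doubling single quotes)."""
    return "'" + text.replace("'", "''") + "'"


def build_tle_install_sql(name: str, version: str, description: str, body_sql: str) -> str:
    """Return SQL that registers `body_sql` as a TLE extension, then enables it."""
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", name):
        raise ValueError("use a simple lowercase identifier as the extension name")
    if _TAG in body_sql:
        raise ValueError("extension body must not contain the dollar-quote tag")
    return (
        "SELECT pgtle.install_extension(\n"
        f"  {_quote(name)}, {_quote(version)}, {_quote(description)},\n"
        f"{_TAG}\n{body_sql.strip()}\n{_TAG}\n"
        ");\n"
        f"CREATE EXTENSION {name};"
    )


# Hypothetical one-function extension written in plain SQL.
example_sql = build_tle_install_sql(
    name="add_one_ext",
    version="0.1",
    description="Demo extension that adds one to an integer",
    body_sql="CREATE FUNCTION add_one(n int) RETURNS int AS $$ SELECT n + 1 $$ LANGUAGE sql;",
)
print(example_sql)
```

The point of the sketch is the shape of the workflow: the extension body is registered through a regular SQL function call rather than by placing files on the database host, which is why no superuser access to the host is needed.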
- New Amazon GuardDuty RDS Protection:
- Protect your data in Aurora with Intelligent Threat Detection
- Leverages ML to accurately detect suspicious activity
- Delivers security findings enriched with contextual data
- Continuously monitors for potential threats with just one click
- AWS Glue Data Quality: Automatically measure, monitor & manage data quality in your data lake
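As a rough illustration of what "measuring" data quality can mean, the sketch below computes two common metrics (completeness and uniqueness) over in-memory rows and reports which rules fail. This is plain Python, not the AWS Glue Data Quality API; the function names, the rule tuple format and the sample `orders` data are all assumptions made for this example.

```python
from collections import Counter
from typing import Any

Row = dict[str, Any]


def completeness(rows: list[Row], column: str) -> float:
    """Fraction of rows in which `column` has a non-null value."""
    if not rows:
        return 1.0
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows)


def uniqueness(rows: list[Row], column: str) -> float:
    """Fraction of non-null values in `column` that occur exactly once."""
    values = [row[column] for row in rows if row.get(column) is not None]
    if not values:
        return 1.0
    counts = Counter(values)
    return sum(1 for value in values if counts[value] == 1) / len(values)


CHECKS = {"completeness": completeness, "uniqueness": uniqueness}


def failed_rules(rows: list[Row], rules: list[tuple[str, str, float]]) -> list[str]:
    """Return a message for every (check, column, minimum) rule scoring below its minimum."""
    failures = []
    for check, column, minimum in rules:
        score = CHECKS[check](rows, column)
        if score < minimum:
            failures.append(f"{check}({column}) = {score:.2f} < {minimum:.2f}")
    return failures


# Hypothetical sample data: one missing email and one duplicated order_id.
orders = [
    {"order_id": 1, "email": "a@example.com"},
    {"order_id": 2, "email": None},
    {"order_id": 2, "email": "b@example.com"},
    {"order_id": 3, "email": "c@example.com"},
]
rules = [
    ("completeness", "email", 0.9),
    ("uniqueness", "order_id", 1.0),
    ("completeness", "order_id", 1.0),
]
failures = failed_rules(orders, rules)
print(failures)
```

A managed service adds the parts this sketch leaves out: running such rules on a schedule against the data lake and tracking the scores over time.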
- Centralized Access Controls for Redshift Data Sharing: Govern access to Redshift using AWS Lake Formation
- ML governance challenges:
- Creating custom policies is time consuming
- Capturing & sharing model information can lead to errors
- Gaining visibility into model performance is expensive
- To address these challenges:
- Amazon SageMaker ML Governance: Governance & Auditability for end-to-end ML development:
- Role Manager: Define custom user permissions in minutes
- Model Cards: Centralize model information & documentation
- Model Dashboard: Monitor model performance in one place
- New Amazon DataZone:
- A data management service to catalog, discover, share & govern data across the organization:
- AWS Lake Formation, Amazon Athena, Amazon Redshift data sharing, APIs to third-party sources
- Amazon DataZone is a portal
- Unlock data across organizational boundaries with built-in governance
- Amazon Aurora Zero-ETL integration with Amazon Redshift
- Amazon Redshift auto-copy from S3: Simplify & automate file ingestion into Redshift:
- Easily create & maintain simple data ingestion pipelines
- Continuously ingest data as soon as new files are created in S3
- Automate data loading without engineering resources
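For readers who want to see what an auto-copy job might look like, here is a small Python sketch that only builds the SQL text; it does not connect to Redshift. The statement shape (`COPY ... JOB CREATE ... AUTO ON`) follows the Redshift COPY JOB documentation as I understand it and should be checked against the current docs before use. The helper name `build_auto_copy_sql`, the table, bucket, role ARN and job name are all placeholders invented for this example.

```python
def build_auto_copy_sql(table: str, s3_prefix: str, iam_role_arn: str, job_name: str) -> str:
    """Return a COPY statement that registers an auto-copy job for an S3 prefix."""
    if not s3_prefix.startswith("s3://"):
        raise ValueError("s3_prefix must start with s3://")
    if not iam_role_arn.startswith("arn:aws:iam::"):
        raise ValueError("iam_role_arn must be an IAM role ARN")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"JOB CREATE {job_name}\n"
        "AUTO ON;"
    )


# Placeholder values; replace with your own table, bucket and role.
copy_job_sql = build_auto_copy_sql(
    table="public.sales",
    s3_prefix="s3://example-bucket/incoming/sales/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
    job_name="sales_auto_copy",
)
print(copy_job_sql)
```

Once such a job exists, new files landing under the S3 prefix are meant to be loaded without a separately scheduled pipeline, which is the "without engineering resources" point above.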
- Amazon AppFlow: Move data between SaaS services and data lakes & data warehouses
- Amazon AppFlow now offers 50+ connectors
- Amazon SageMaker Data Wrangler: Import data from SaaS services & third-party sources for ML
- Similarly, access 40+ new data sources from Amazon SageMaker Data Wrangler
- Education -> Program Update:
- AWS ML University now provides educator training (i.e. train-the-trainer): an AI & ML educator training program for community colleges & minority-serving institutions (MSIs) in the US
- AWS AI & ML Scholarship Program, announced last year (2021), for underserved & underrepresented students
- 150+ courses from AWS on learning Data & AI/ML
References
Watch now on YouTube: Keynote by Swami Sivasubramanian
Explore: AWS re:Invent 2022
Note: Copyright for the images used in this blog belongs to AWS / Amazon