Manage Your Amazon S3 Objects with Amazon S3 Metadata!

9 hours ago · 3 min read

Written by Minhyeok Cha

How do you manage your Amazon S3 objects? Do you search for them directly in the console? Use the CLI or an SDK? Or maybe you rely on AWS Glue crawlers? Recently, my company noticed that S3 costs were gradually piling up, so I started looking for ways to reduce them.

Initially, I thought, "We can just move unused data to Glacier, and that’s it." However, managing the massive amount of data accumulated over about six years in a single bucket turned out to be a bit tricky. That’s when I noticed the "table bucket" feature and thought, “Why not give the relatively new S3 Metadata a try?” Fortunately, it worked out well, and I’d like to share my experience.

Table of Contents

What is Amazon S3 Metadata?
What is AWS Lake Formation?
Demo
S3 Cost Optimization Strategy
Conclusion

What is Amazon S3 Metadata?

[Image: Amazon S3 Metadata] *(Source: AWS)*

You can find an introduction to Amazon S3 Metadata in an article I previously wrote, titled A Summary of Key Announcements from AWS re:Invent in 10 Minutes.

In that article, I mentioned that S3 Metadata can be integrated with AWS Glue Data Catalog. However, in this post, I’ll explore using AWS Lake Formation instead. Initially, I planned to use AWS Glue’s crawling feature, but decided to experiment with the officially released table bucket and Amazon S3 Metadata, which came out earlier this year.

What is AWS Lake Formation?

So, what exactly is AWS Lake Formation? AWS Lake Formation simplifies and automates the complex and time-consuming tasks involved in building a data lake. These tasks include collecting, cleaning, moving, and cataloging data, as well as securing access to it for analytics and machine learning.

It also provides its own permission management model based on AWS Identity and Access Management (IAM).

This centralized permission management model allows for fine-grained access control to the data lake through a simple grant/revoke mechanism. Permissions in AWS Lake Formation can be applied at the table and column levels for all datasets in the data lake. Services integrated with this permission management include AWS Glue, Amazon Athena, Amazon Redshift Spectrum, and Amazon QuickSight. However, our primary goal is to access S3 objects for querying without crawling, so we’ll be using Lake Formation mainly as a connection pathway.

Demo

Since my company account has restricted permissions, this demo will be conducted using a test account.


💡 At the time of writing, Table Buckets and Amazon S3 Metadata are only available in the Ohio and Northern Virginia regions.

Step 1: Create an S3 Table Bucket
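
If you prefer scripting to the console, Step 1 could look roughly like the sketch below. It assumes boto3's `s3tables` client and its `create_table_bucket(name=...)` call. The helper name `create_table_bucket` and the bucket name `demo-table-bucket` are hypothetical and not from the original demo. The client is passed in as an argument so the helper can be tried without AWS credentials.

```python
def create_table_bucket(s3tables_client, name):
    """Create an S3 table bucket and return its ARN.

    `s3tables_client` is assumed to be a boto3 "s3tables" client, e.g.
    boto3.client("s3tables", region_name="us-east-1").
    """
    response = s3tables_client.create_table_bucket(name=name)
    # The response is assumed to carry the new table bucket's ARN under "arn".
    return response["arn"]


# Hypothetical usage (requires boto3 and valid credentials):
#   import boto3
#   client = boto3.client("s3tables", region_name="us-east-1")
#   print(create_table_bucket(client, "demo-table-bucket"))
```

Keep the returned ARN at hand, since the metadata configuration in the next step needs to point at this table bucket.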

Step 2: Generate Metadata for the S3 Bucket to Test

That completes the connection between S3 and the table bucket.
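
The connection made in Step 2 can also be scripted. The following is a minimal sketch, assuming boto3's S3 client exposes `create_bucket_metadata_table_configuration` with an `S3TablesDestination` payload. The function names and any bucket or table names you pass in are hypothetical.

```python
def build_metadata_table_config(table_bucket_arn, table_name):
    """Build the payload that points a bucket's metadata at a table bucket."""
    return {
        "S3TablesDestination": {
            "TableBucketArn": table_bucket_arn,
            "TableName": table_name,
        }
    }


def enable_s3_metadata(s3_client, source_bucket, table_bucket_arn, table_name):
    """Attach a metadata table to `source_bucket` (a general purpose bucket).

    `s3_client` is assumed to be boto3.client("s3", region_name=...).
    """
    s3_client.create_bucket_metadata_table_configuration(
        Bucket=source_bucket,
        MetadataTableConfiguration=build_metadata_table_config(
            table_bucket_arn, table_name
        ),
    )
```

Separating the payload builder from the API call makes it easy to inspect exactly what will be sent before touching a real bucket.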

Step 3: Check with Amazon Athena

However, if you open Athena before the table bucket has been cataloged, nothing will show up. Normally, you would need to create a catalog through AWS Glue. Fortunately, a new feature in Lake Formation integrates S3 tables automatically, which makes the setup smoother.

Step 4: Enable S3 Table Integration in AWS Lake Formation

When integrating, make sure to specify a role with S3 access permissions.

Once the integration is successful, the catalog will be displayed as shown below. Go into the catalog and proceed with policy settings.

In the Permissions section, click Grant to continue.

If you followed the steps correctly, go to Athena to check if the S3 data appears as expected.

Step 5: Successful Amazon Athena Query!

The data appeared without running an AWS Glue crawler, and the query executed successfully.

S3 Cost Optimization Strategy

The optimization process was straightforward. I wrote Athena queries against the metadata table, downloaded the results as a CSV, and used the CLI to move the objects identified by the queries to the Glacier storage class.
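
The exact queries and commands are not reproduced in the text, so the following is only a sketch of what such a workflow might look like. It assumes the metadata table exposes `bucket`, `key`, `size`, `storage_class`, and `last_modified_date` columns, that Athena addresses it as `"s3tablescatalog/<table bucket>"."aws_s3_metadata"."<table>"`, and that the downloaded CSV keeps the `bucket` and `key` columns. The function names, the cutoff date, and the choice of `aws s3 cp` with `--storage-class GLACIER` are my assumptions, not necessarily what was used here.

```python
import csv
import io
import shlex


def build_stale_objects_query(table_bucket, table_name, cutoff_date):
    """Return Athena SQL listing STANDARD objects last modified before cutoff_date.

    `cutoff_date` is a 'YYYY-MM-DD' string; column names are assumptions.
    """
    return (
        "SELECT bucket, key, size, last_modified_date\n"
        f'FROM "s3tablescatalog/{table_bucket}"."aws_s3_metadata"."{table_name}"\n'
        "WHERE storage_class = 'STANDARD'\n"
        f"  AND last_modified_date < TIMESTAMP '{cutoff_date} 00:00:00'"
    )


def glacier_commands(csv_text):
    """Turn query results (CSV with bucket and key columns) into AWS CLI commands."""
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        uri = f"s3://{row['bucket']}/{row['key']}"
        # Copying an object onto itself with a new storage class
        # rewrites it in place under that class.
        argv = ["aws", "s3", "cp", uri, uri, "--storage-class", "GLACIER"]
        # shlex.join quotes keys that contain spaces or shell metacharacters.
        commands.append(shlex.join(argv))
    return commands
```

Generating the commands as text first, rather than executing them directly, leaves room to review the list before any object is moved.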

S3 Lifecycle Management

[Image: S3 Lifecycle management]

Following that, I configured S3 Lifecycle policies to automatically move data to Glacier over time.
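
A lifecycle rule of this kind can be expressed as a small configuration document. The sketch below builds one in Python; the 90-day threshold, the rule ID, and the helper name are hypothetical values chosen for illustration, not the settings used in this post.

```python
import json


def build_glacier_lifecycle(days, prefix=""):
    """Build an S3 lifecycle configuration that transitions objects to Glacier.

    An empty prefix applies the rule to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": f"to-glacier-after-{days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# Print the configuration as JSON so it can be saved to a file.
print(json.dumps(build_glacier_lifecycle(90), indent=2))
```

The printed JSON can be saved as, say, `lifecycle.json` and applied with `aws s3api put-bucket-lifecycle-configuration --bucket <your-bucket> --lifecycle-configuration file://lifecycle.json`, or the dictionary can be passed as `LifecycleConfiguration` to boto3's `put_bucket_lifecycle_configuration`.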

Conclusion

I had been meaning to try out AWS’s new features and finally got around to it in March 2025. I had heard about S3 cost optimization countless times, but doing it myself instead of relying on consulting felt quite refreshing.

For those who haven’t managed their S3 buckets before, I think this new method is definitely worth considering. It’s simpler to use than setting up Glue, which I found particularly appealing. However, I did find AWS Lake Formation’s setup a bit tricky initially. Still, if you need to manage data in your buckets, it might be worth giving it a try.

✅ Note: Deleting S3 table buckets can only be done via CLI or SDK, so keep that in mind.
