How I leveraged FastAPI's dependency injection to reduce code by more than 30%
FastAPI offers a powerful dependency injection system that lets us design simpler, more streamlined applications. This article describes how I used it to reduce the amount of code in my project.
This article assumes you know what Dependency Injection means in the FastAPI world and are familiar with its basic syntax. If you haven't explored or used it yet, I recommend following this guide by the original author of FastAPI.
Dependencies are incredibly useful when you need to:
Share logic effortlessly (use the same code logic over and over again).
Seamlessly share database connections.
Enforce security, authentication, role requirements, and more with ease.
And so many other amazing things...
All this, while keeping code repetition to a minimum!
I started out using Dependency Injection just for authentication, like this:
from typing import Annotated

from fastapi import Security
from fastapi.security import APIKeyHeader

from .models import (
    Auth0APIUser,
    RootAuth0User,
)

# Auth0, env_config and auth0_api_response_to_user are assumed to be
# imported from elsewhere in the project.

# Security flow to authenticate user using Password flow
header_username = APIKeyHeader(
    name="username", scheme_name="username", auto_error=False
)
header_password = APIKeyHeader(
    name="password", scheme_name="password", auto_error=False
)

# Security flow to authenticate user using Auth0
auth0_authentication = Auth0(
    domain=env_config.auth0_domain,
    api_audience=env_config.auth0_api_audience,
    auth0user_model=RootAuth0User,  # type: ignore
    auto_error=False,
)


def user_authentication_security(
    user: Annotated[RootAuth0User, Security(auth0_authentication.get_user)],
) -> Auth0APIUser:
    """FastAPI Security injection to perform user auth using Auth0"""
    return auth0_api_response_to_user(user.model_dump())
Here is how it was applied in an API endpoint function:
@app.get("/schema/{schema}/tables/{table}")
async def _get_table_data(
    schema: str,
    table: str,
    user: Annotated[Auth0APIUser, Depends(user_authentication_security)],
) -> dict:
    # checking if user has access to schema
    schema_access = await check_schema_access(schema, user)
    if schema_access is False:
        logger.warning(
            f"{user.email_id} is not authorized to access {schema}",
            extra=logger_properties,
        )
        raise HTTPException(
            status.HTTP_403_FORBIDDEN,
            detail=f"{user.email_id} is not authorized to access {schema}",
        )

    # checking if user has access to table
    table_access = await check_table_access(schema, table, user)
    if table_access is False:
        logger.warning(
            f"{user.email_id} is not authorized to access {table} under {schema}",
            extra=logger_properties,
        )
        raise HTTPException(
            status.HTTP_403_FORBIDDEN,
            detail=f"{user.email_id} is not authorized to access {table} under {schema}",
        )

    # checking if requested dataset (schema) is present
    current_dataset = await datasetDetailsV2.find_one(
        datasetDetailsV2.dataset == schema
    )
    if not current_dataset:
        raise HTTPException(
            status.HTTP_404_NOT_FOUND,
            detail=f"dataset: {schema} not found",
        )

    # checking if requested table is present in dataset
    complete_table_name = f"{current_dataset.destination_metadata.unity_catalog}.{current_dataset.dataset}.{table}"
    if complete_table_name not in current_dataset.delta_share_compatible_table:
        raise HTTPException(
            status.HTTP_404_NOT_FOUND,
            detail=f"table: {table} not found in dataset: {schema}",
        )

    # From here onwards comes the core logic of the endpoint
    ...
The code above is practically shouting, "I need some help! 😫". There are several issues:
A lot of code besides the core logic lives in the function. This violates the Single Responsibility Principle (SRP) of the SOLID principles.
If I need to check whether a Table is present or whether a User has the right access in another endpoint function, I will end up copying and pasting the same logic there too.
What if I need to change the authentication and authorization logic in the future? I would have to update every copy, and testing would become a lot more complicated too.
Also, long functions just look too ugly 🤮
We can improve this by moving authentication, authorization, verification, and all other non-core logic out of the API endpoint function into separate dependency functions. We can even nest dependency functions to comply with Don’t Repeat Yourself (DRY).
from .models import (
    Auth0APIUser,
    RootAuth0User,
)

# Security flow to authenticate user using Password flow
header_username = APIKeyHeader(
    name="username", scheme_name="username", auto_error=False
)
header_password = APIKeyHeader(
    name="password", scheme_name="password", auto_error=False
)

# Security flow to authenticate user using Auth0
auth0_authentication = Auth0(
    domain=env_config.auth0_domain,
    api_audience=env_config.auth0_api_audience,
    auth0user_model=RootAuth0User,  # type: ignore
    auto_error=False,
)


def user_authentication_security(
    user: Annotated[RootAuth0User, Security(auth0_authentication.get_user)],
) -> Auth0APIUser:
    """FastAPI Security injection to perform user auth using Auth0"""
    return auth0_api_response_to_user(user.model_dump())


async def dataset_table_availability_verification(dataset: str, table: str):
    """FastAPI dependency to check if requested dataset and table are present"""
    # checking if requested dataset is present
    current_dataset = await datasetDetailsV2.find_one(
        datasetDetailsV2.dataset == dataset
    )
    if not current_dataset:
        raise HTTPException(
            status.HTTP_404_NOT_FOUND,
            detail=f"dataset: {dataset} not found",
        )

    # checking if requested table is present in dataset
    complete_table_name = f"{current_dataset.destination_metadata.unity_catalog}.{current_dataset.dataset}.{table}"
    if complete_table_name not in current_dataset.delta_share_compatible_table:
        raise HTTPException(
            status.HTTP_404_NOT_FOUND,
            detail=f"table: {table} not found in dataset: {dataset}",
        )
    return current_dataset
# 1st Level dependency
async def role_verification(
    user: Annotated[Auth0APIUser, Depends(get_user)], claim: str
) -> userInfoV2:
    # getting current user by matching auth0 api user id
    current_user = await userInfoV2.find_one(userInfoV2.user_id == user.user_id)
    if not current_user:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND, detail="user not found"
        )

    # checking if requested role is present
    user_rbac = RBACPolicyClient(user_policy=current_user.model_dump())
    try:
        user_rbac.enforce_claim(policy_level=PolicyLevel.role, claim=claim)
    except RBACPolicyViolation as e:
        logger.warning(f"{user.primary_email} not approved for {claim}")
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail=f"{user.primary_email} don't have role: {claim}",
        ) from e
    # returning the user so the declared return type is honoured
    return current_user


# 1st Level dependency
async def scope_verification(
    user: Annotated[Auth0APIUser, Depends(get_user)], claim: str
) -> userInfoV2:
    # getting current user by matching auth0 api user id
    current_user = await userInfoV2.find_one(userInfoV2.user_id == user.user_id)
    if not current_user:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND, detail="user not found"
        )

    # checking if requested scope is present
    user_rbac = RBACPolicyClient(user_policy=current_user.model_dump())
    try:
        user_rbac.enforce_claim(policy_level=PolicyLevel.scope, claim=claim)
    except RBACPolicyViolation as e:
        logger.warning(f"{user.primary_email} not approved for {claim}")
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail=f"{user.primary_email} don't have scope: {claim}",
        ) from e
    return current_user


# 2nd Level dependency
async def write_table_scope_dependency(
    user: Annotated[Auth0APIUser, Depends(get_user)],
    dataset: str,
    table: str,
) -> userInfoV2:
    scope_condition = f"{dataset}:{table}::write"
    return await scope_verification(user=user, claim=scope_condition)


# 2nd Level dependency
async def read_table_scope_dependency(
    user: Annotated[Auth0APIUser, Depends(get_user)],
    dataset: str,
    table: str,
) -> userInfoV2:
    scope_condition = f"{dataset}:{table}::read"  # read scope will be present by default
    return await scope_verification(user=user, claim=scope_condition)


# 2nd Level dependency
async def engineering_admin_role_dependency(
    user: Annotated[Auth0APIUser, Depends(get_user)],
) -> userInfoV2:
    return await role_verification(user=user, claim="management:engineering::admin")
Note that I've reused the 1st level dependencies in several 2nd level dependencies, following the DRY principle. Up next, we'll see how they are applied in API endpoint functions.
# EG 1. Here I am using authorization as a dependency rather than inside the function
@app.get("/user", dependencies=[Depends(engineering_admin_role_dependency)])
async def _list_users() -> list[userInfoV2]:
    """Get the list of all users along with their access"""
    return await userInfoV2.find_all().to_list()


# EG 2. Similar to above, I used authorization as a dependency
@app.patch(
    "/dataset",
    status_code=status.HTTP_201_CREATED,
    dependencies=[Depends(engineering_admin_role_dependency)],
)
async def _update_dataset(details: DatasetInput):
    """Update existing dataset in the system"""
    # checking if dataset is present
    current_dataset = await datasetDetailsV2.find_one(
        datasetDetailsV2.dataset == details.dataset
    )
    if not current_dataset:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="dataset not present",
        )
    # Core logic will come here


# EG 3. Using multiple dependencies to make the code cleaner
@router_v2.get(
    "/dataset/{dataset}/table/{table}",
    dependencies=[Depends(read_table_scope_dependency)],
    response_class=UJSONResponse,
)
async def _get_table_data(
    dataset: str,
    table: str,
    current_dataset: Annotated[
        datasetDetailsV2, Depends(dataset_table_availability_verification)
    ],
) -> list[dict]:
    """Get delta table from desired dataset & table"""
    # Core logic to read table data will come here
The key takeaways we can gather from the above are:
Keep things tidy by separating dependencies like auth and verification from the endpoint function. This makes the code look cleaner.
Use multilevel dependency functions to reuse code that repeats.
Testing becomes easier since we can create dedicated tests focusing on the endpoint function and other tests focusing on the dependency function.
Other endpoints with the same needs can easily reuse the same dependency injections.
All of this helped cut down the lines of code, streamline the logic, and generally make the code easier to read.
But there's still room for improvement. Some of the remaining logic still doesn't belong to the core logic. By building on Multilevel Dependency Injection, we can further optimize it as follows:
# 1st Level dependency
async def user_presence_dependency(
    request: Request,
    user: Annotated[Auth0APIUser, Depends(get_user)],
) -> UserInfoV2:
    current_user_df = pl.read_database(
        current_user_filter(user.user_id), request.app.state.cursor
    )
    if current_user_df.is_empty():
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=f"{user.primary_email} not found",
        )
    return UserInfoV2.from_polars(current_user_df)


async def dataset_presence_dependency(
    request: Request,
    dataset: str,
) -> DatasetInfoV2:
    current_dataset_df = pl.read_database(
        current_dataset_filter(dataset), request.app.state.cursor
    )
    if current_dataset_df.is_empty():
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=f"{dataset} not found",
        )
    return DatasetInfoV2.from_polars(current_dataset_df)


# 2nd Level dependency
async def table_presence_dependency(
    dataset: Annotated[DatasetInfoV2, Depends(dataset_presence_dependency)],
    table: str,
) -> str:
    if dataset.table is None or table not in dataset.table:
        raise HTTPException(
            status.HTTP_404_NOT_FOUND,
            detail=f"table: {table} not found in dataset: {dataset.dataset}",
        )
    return table


# 3rd Level dependency
async def read_table_scope_dependency(
    user: Annotated[UserInfoV2, Depends(user_presence_dependency)],
    dataset: str,
    table: str,
) -> UserInfoV2:
    scope_condition = f"{dataset}:{table}::read"  # read scope will be present by default
    return await scope_verification(user=user, claim=scope_condition)


async def engineering_admin_role_dependency(
    user: Annotated[UserInfoV2, Depends(user_presence_dependency)],
) -> UserInfoV2:
    return await role_verification(user=user, claim=FixedString.ENGINEERING_ADMIN.value)
Next is its application in the API endpoint functions:
@router_v2.patch(
    "/dataset/{dataset}",
    status_code=status.HTTP_201_CREATED,
    dependencies=[Depends(engineering_admin_role_dependency)],
)
async def _update_dataset(
    dataset: Annotated[DatasetInfoV2, Depends(dataset_presence_dependency)],
) -> dict[str, str]:
    """Update the existing dataset details"""
    # Core logic will come here


async def _read_data(
    dataset: Annotated[DatasetInfoV2, Depends(dataset_presence_dependency)],
    table: Annotated[str, Depends(table_presence_dependency)],
    filter_query: Annotated[FilterQueryParams, Query()],
    request: Request,
) -> ORJSONResponse:
    """Get data from desired dataset & table"""
    # Core logic will come here
The key takeaways we can gather from the above are:
The code now looks even cleaner. The API endpoint function now includes only the code related to itself and nothing else.
You might think that adding more dependency functions means adding more lines of code, but these are reusable dependencies. When you have many endpoints with the same requirements, the benefits really add up!
In this article, I have used 2-3 endpoints as examples, but my actual project has upwards of 30 endpoints. The impact was as follows:
The endpoints are grouped by category. Ideally, all the endpoints in a category share common requirements such as authentication methods. I simply declared the security dependency at the API Router definition, so the endpoint functions attached to that router don’t need to declare it again.
Even for the endpoints that have additional requirements, I can just add the respective dependency.
That's why, overall, I was able to reduce the code in my project by more than 30%.
Apart from code optimization, if you look closely at the endpoints that take path or query parameters, using dependencies makes them a lot more readable. Anyone can now see what the requirements are for a specific parameter.