Building a Serverless Code Review Assistant with AWS Bedrock and GitLab

TL;DR: We built a serverless code review assistant that integrates AWS Bedrock with GitLab, automating code reviews while maintaining detailed metrics. The system evolved from a simple pipeline-based approach to a robust serverless architecture, significantly improving our development workflow.

The Challenge

Our team faced recurring challenges with code review oversights leading to production issues. Despite having multiple approvals, templates, and merge conditions in place, we needed a more robust automated solution: one that could review every merge request consistently, track its own cost, and fit into our existing GitLab workflow.

System Evolution

Initial Implementation

Our first approach used GitLab CI pipelines directly:

Initial Pipeline Setup

Initial pipeline-based implementation with direct integrations

This setup quickly revealed several limitations, which pushed us toward a redesign.

Enhanced Architecture

Learning from these challenges, we redesigned the system as a serverless application defined in an AWS CDK stack. A GitLab webhook receiver is exposed through API Gateway behind a Route 53 custom domain. The main logic runs in Step Functions, which coordinates three Lambda functions for distinct tasks: merge request reviews, emoji processing, and comment handling. Review data is stored in DynamoDB tables with TTL and stream processing for automated cleanup. The stack applies AWS best practices, including API key authentication, CloudWatch monitoring, and secret management through Secrets Manager, and it is organized into separate constructs to keep it modular and maintainable.
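The Step Functions coordination described above can be sketched as a state machine definition in Amazon States Language; the state names and ARNs below are illustrative placeholders, not our actual resources:

```json
{
  "Comment": "Illustrative review flow; state names and ARNs are placeholders",
  "StartAt": "ReviewMergeRequest",
  "States": {
    "ReviewMergeRequest": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:eu-west-1:111111111111:function:mr-review",
      "Next": "ProcessEmoji"
    },
    "ProcessEmoji": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:eu-west-1:111111111111:function:emoji-processor",
      "Next": "HandleComments"
    },
    "HandleComments": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:eu-west-1:111111111111:function:comment-handler",
      "End": true
    }
  }
}
```

Each task invokes one Lambda in sequence, so failures surface per step and can be retried independently.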

New Architecture Setup

Enhanced serverless architecture with centralized management

Core Components

1. AWS Integration Layer

At the heart of our system is a robust AWS helper class managing all cloud service interactions:


import logging
import os

import boto3


class AwsHelper:
    def __init__(self, dynamodb_table_name=None):
        self.region_name = os.environ["AWS_REGION"]
        self.logger = logging.getLogger()
        self.logger.setLevel(logging.INFO)
        if dynamodb_table_name:
            # Table objects come from the DynamoDB *resource*, not the client
            dynamodb = boto3.resource("dynamodb", region_name=self.region_name)
            self.table = dynamodb.Table(dynamodb_table_name)

    def get_bedrock_client(self):
        # Control-plane client: listing models and inspecting metadata
        return boto3.client(service_name="bedrock", region_name=self.region_name)

    def get_bedrock_runtime_client(self):
        # Data-plane client: invoking models for inference
        return boto3.client(
            service_name="bedrock-runtime", region_name=self.region_name
        )

2. Intelligent Review System

The MrReviewer class handles the core review logic with adaptive depth based on change size:


def generate_review_prompt(self, changes):
    # Count added/removed lines across all diffs to gauge review depth
    total_changed_lines = sum(
        len([line for line in change.get("diff", "").split("\n")
             if line.startswith("+") or line.startswith("-")])
        for change in changes
    )

    if total_changed_lines <= 5:
        base_instruction = """Please provide a very brief review of the following small code change.
        Your review should be no more than 2-3 sentences long."""
    elif total_changed_lines <= 20:
        base_instruction = """Please provide a concise review of the following code changes.
        Your review should be about 4-5 sentences long."""
    else:
        # Larger changes fall through to a full review (wording illustrative)
        base_instruction = """Please provide a thorough review of the following code changes."""
    return base_instruction
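For intuition, here is a self-contained version of the line-counting step (the sample diffs are made up, but they follow GitLab's diff format):

```python
def count_changed_lines(changes):
    """Count added (+) and removed (-) lines across a list of GitLab diff dicts."""
    return sum(
        len([line for line in change.get("diff", "").split("\n")
             if line.startswith("+") or line.startswith("-")])
        for change in changes
    )

changes = [
    {"diff": "@@ -1,2 +1,2 @@\n-old_value = 1\n+old_value = 2\n context"},
    {"diff": "@@ -5 +5 @@\n+print('hello')"},
]
print(count_changed_lines(changes))  # 3 -> falls in the "very brief review" tier
```

Note that hunk headers (`@@ ... @@`) start with `@`, so they never inflate the count.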

3. Cost Management

Precise cost tracking helps teams monitor and optimize their usage:


from decimal import Decimal

def calculate_review_price(self, input_tokens, output_tokens):
    # On-demand pricing in USD per 1K tokens (matches Claude 3 Sonnet
    # at the time of writing); Decimal avoids float rounding drift
    input_price_per_1k = Decimal("0.003")
    output_price_per_1k = Decimal("0.015")
    input_cost = (Decimal(input_tokens) / 1000) * input_price_per_1k
    output_cost = (Decimal(output_tokens) / 1000) * output_price_per_1k
    return input_cost, output_cost, input_cost + output_cost
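As a quick sanity check, here is the same arithmetic outside the class: a review consuming 2,000 input tokens and 500 output tokens costs just over a cent.

```python
from decimal import Decimal

def calculate_review_price(input_tokens, output_tokens):
    # Standalone copy of the method above, same rates
    input_price_per_1k = Decimal("0.003")
    output_price_per_1k = Decimal("0.015")
    input_cost = (Decimal(input_tokens) / 1000) * input_price_per_1k
    output_cost = (Decimal(output_tokens) / 1000) * output_price_per_1k
    return input_cost, output_cost, input_cost + output_cost

input_cost, output_cost, total = calculate_review_price(2000, 500)
print(input_cost, output_cost, total)  # 0.006 0.0075 0.0135
```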

4. GitLab Integration

A Lambda function handles merge request events and integrates with our change management process:


import logging

logger = logging.getLogger()

def handler(event, context):
    logger.info("Lambda function invoked")
    try:
        body = event
        merge_request_id = body["merge_request"]["iid"]
        project_id = body["project"]["id"]
        discussion_id = body["object_attributes"]["discussion_id"]

        reply = """
**Did you know that if this MR makes any production change,
Essent requires you to create a Change Request?** \n
This is a formal process to modify a product, system,
document or deliverable in a project."""

        response = GitlabHelper().reply_merge_mate_note(
            project_id, merge_request_id, discussion_id, reply
        )
        return response
    except KeyError as e:
        logger.error(f"Missing expected field in webhook payload: {e}")
        raise
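The fields the handler reads map onto GitLab's note webhook payload. A minimal stand-in event shows the shape (all values below are fabricated):

```python
# Minimal stand-in for a GitLab note webhook event (values are fabricated)
event = {
    "project": {"id": 42},
    "merge_request": {"iid": 7},
    "object_attributes": {"discussion_id": "abc123"},
}

# Same lookups the handler performs
project_id = event["project"]["id"]
merge_request_id = event["merge_request"]["iid"]
discussion_id = event["object_attributes"]["discussion_id"]
print(project_id, merge_request_id, discussion_id)  # 42 7 abc123
```

Note that GitLab exposes the merge request's project-scoped `iid`, not its global `id`; the API calls expect the former.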

5. AI Integration

AWS Bedrock powers our intelligent code reviews:


import json

def invoke_bedrock_for_review(self, prompt, bedrock, bedrock_runtime,
                              bedrock_model_name):
    # Resolve the model ID from its human-readable name
    foundation_models = bedrock.list_foundation_models()
    matching_model = next(
        (model for model in foundation_models["modelSummaries"]
         if model.get("modelName") == bedrock_model_name),
        None
    )
    if not matching_model:
        raise ValueError(f"Model {bedrock_model_name} not found")

    request_payload = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4000,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
        "temperature": 1.0,
        "top_p": 0.9,
    }
    response = bedrock_runtime.invoke_model(
        modelId=matching_model["modelId"],
        body=json.dumps(request_payload),
    )
    return json.loads(response["body"].read())
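The request body follows the Anthropic Messages format that Bedrock expects. Pulling its construction into a small standalone helper (the function name is ours, not from the codebase) makes it easy to test without touching AWS:

```python
import json

def build_review_payload(prompt, max_tokens=4000, temperature=1.0, top_p=0.9):
    """Build the Anthropic Messages API body expected by Bedrock."""
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
        "temperature": temperature,
        "top_p": top_p,
    }

body = json.dumps(build_review_payload("Review this diff: ..."))
print(json.loads(body)["max_tokens"])  # 4000
```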

6. Data Persistence

All review data and metrics are stored in DynamoDB:


import time
from datetime import datetime

def write_to_dynamodb(self, partition_key, data, bedrock_response=None,
                      error_message=None):
    timestamp = int(time.time() * 1000)
    human_readable_timestamp = datetime.fromtimestamp(
        timestamp / 1000
    ).strftime("%Y-%m-%d %H:%M:%S")
    item = {
        "PK": str(partition_key),
        "SK": str(timestamp),
        "status": "success" if not error_message else "failure",
        "timeStamp": human_readable_timestamp,
    }

    if bedrock_response:
        input_tokens = bedrock_response["usage"]["input_tokens"]
        output_tokens = bedrock_response["usage"]["output_tokens"]
        input_cost, output_cost, total_cost = (
            MrReviewer().calculate_review_price(
                input_tokens, output_tokens
            )
        )
        # Persist token usage and cost alongside the review record
        item.update({
            "inputTokens": input_tokens,
            "outputTokens": output_tokens,
            "totalCost": total_cost,
        })
    self.table.put_item(Item=item)
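The PK/SK pattern stores one row per review event: the partition key groups events for a merge request, and the millisecond timestamp in the sort key keeps them ordered. A toy illustration of the item skeleton (the helper name and sample key are ours):

```python
import time

def build_review_item(partition_key, error_message=None, now_ms=None):
    """Assemble the DynamoDB item skeleton; now_ms is injectable for testing."""
    timestamp = now_ms if now_ms is not None else int(time.time() * 1000)
    return {
        "PK": str(partition_key),
        "SK": str(timestamp),
        "status": "success" if not error_message else "failure",
    }

item = build_review_item("project-42-mr-7", now_ms=1700000000000)
print(item)  # {'PK': 'project-42-mr-7', 'SK': '1700000000000', 'status': 'success'}
```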

What is AWS Bedrock?

For our code review system, we leverage AWS Bedrock, Amazon's fully managed service for foundation models. It gives us API access to models such as Anthropic's Claude without our having to run any inference infrastructure. Here is how we initialize it and select a model:


# Example of our Bedrock initialization and model selection
def get_bedrock_model(self, bedrock_client, model_name):
    try:
        foundation_models = bedrock_client.list_foundation_models()
        matching_model = next(
            (model for model in foundation_models["modelSummaries"]
             if model.get("modelName") == model_name),
            None
        )
        if not matching_model:
            raise ValueError(f"Model {model_name} not found in available models")
        return matching_model
    except Exception as e:
        self.logger.error(f"Error getting Bedrock model: {str(e)}")
        raise

What Are Foundation Models?

A foundation model is a large-scale deep learning model trained on vast amounts of data. Because it learns broad knowledge and patterns from that data, it can be adapted to many tasks, serving as a base for applications such as text generation, question answering, and code analysis.

How Large Language Models Work

Large Language Models (LLMs) predict the next word (or token) in a text based on context. They have been trained on large datasets of text so they can guess what comes next in a sentence. This guessing is the foundation for many applications, including chatbots and code reviews.

LLMs expose sampling parameters such as temperature (how random the next-token choice is) and top-k or top-p (how many candidate tokens are considered). Tuning these trades determinism against creativity, which is why our review payload sets both temperature and top_p explicitly.
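To make temperature concrete, here is a small framework-free sketch of how it reshapes next-token probabilities (the logits are made up):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cool = softmax_with_temperature(logits, 0.5)  # sharper: favors the top token
hot = softmax_with_temperature(logits, 2.0)   # flatter: more random sampling
print(round(cool[0], 3), round(hot[0], 3))  # 0.864 0.502
```

At low temperature the model almost always picks the highest-scoring token; at high temperature the alternatives get a real chance.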

Key Features

Lessons Learned

Future Improvements

Conclusion

Our serverless code review assistant has significantly improved our development workflow by providing consistent, automated reviews while maintaining cost effectiveness. The evolution from a simple pipeline-based approach to a sophisticated serverless architecture demonstrates the importance of learning from real-world usage and adapting to team needs.