MCP Use Case on Azure - Memento Memory Storage Demo

July 18, 2025

Problem Statement

The Model Context Protocol (MCP) opens up new possibilities for creating AI-powered applications that can interact with various data sources and services. However, understanding how to deploy MCP servers in cloud environments and integrate them with AI services can be challenging for developers new to the protocol. There's a need for practical examples that demonstrate MCP's capabilities while showcasing real-world deployment patterns on cloud platforms like Azure.

Additionally, what if there's a need to provide an interface that uses natural language to store and retrieve data on cloud-backed persistent volumes, such as Azure Files? Traditional file storage interfaces require users to navigate complex directory structures, remember exact file names, and use specific query syntax. This creates friction when users want to quickly store information or find content using conversational, natural language interactions. The challenge becomes even more complex when considering multi-user scenarios, remote access patterns, and the need for scalable cloud-native deployments.

Solution

This blog post explores a simple yet practical MCP use case called "Memento" - a personal memory storage system that allows users to store and retrieve notes, meeting summaries, and other content using natural language. We'll walk through how this MCP server is deployed on Azure Kubernetes Service (AKS) with Azure Files for persistent storage.

Important Note: This example is designed for educational and demonstration purposes. Security considerations are intentionally simplified and will be addressed in future blog posts. This implementation is NOT suitable for production use without significant security enhancements.

What is Memento?

Memento is an MCP server that provides users with a natural language interface to:

  • Store personal memories, notes, and content
  • Retrieve information using natural language queries
  • Organize content with tags and metadata
  • Access data remotely from local workstations

The system demonstrates several key concepts:

  • Remote MCP server deployment
  • Cloud-native storage integration
  • Multi-user isolation
  • Natural language processing with Azure OpenAI

Architecture Overview

The current implementation represents the evolution from a simple local file-based system to a comprehensive Kubernetes deployment:

┌─────────────────────────────────────────────────────────┐
│                Local Workstation                        │
│  ┌─────────────────┐                                   │
│  │ MCP Client      │                                   │
│  │ Interactive     │ ◄──── HTTPS/SSE ────┐            │
│  │                 │                      │            │
│  │ memento_mcp_    │                      │            │
│  │ client_         │                      │            │
│  │ interactive.py  │                      │            │
│  └─────────────────┘                      │            │
└────────────────────────────────────────────┼────────────┘
                                             │
                                             ▼
┌─────────────────────────────────────────────────────────┐
│              Azure Cloud Services                       │
│                                                         │
│  ┌───────────────────────────────────────────────────┐ │
│  │        Azure Kubernetes Service (AKS)             │ │
│  │                                                   │ │
│  │  ┌─────────────────┐    ┌─────────────────┐      │ │
│  │  │ Load Balancer   │    │ Azure Files     │      │ │
│  │  │                 │    │ Persistent      │      │ │
│  │  │ Public IP       │    │ Volume          │      │ │
│  │  │ IP Whitelist    │    │                 │      │ │
│  │  └─────────────────┘    │ /memento_       │      │ │
│  │           │              │ storage/        │      │ │
│  │           │              │ ├── alice/      │      │ │
│  │           ▼              │ ├── bob/        │      │ │
│  │  ┌─────────────────┐    │ └── charlie/    │      │ │
│  │  │ MCP Server Pod  │◄───┤                 │      │ │
│  │  │                 │    └─────────────────┘      │ │
│  │  │ memento_mcp_    │                             │ │
│  │  │ server.py       │                             │ │
│  │  │                 │                             │ │
│  │  │ 0.0.0.0:8000    │                             │ │
│  │  └─────────────────┘                             │ │
│  └───────────────────────────────────────────────────┘ │
│                                                         │
│  ┌─────────────────┐                                   │
│  │ Azure OpenAI    │                                   │
│  │ Service         │                                   │
│  │                 │                                   │
│  │ - GPT-4o-mini   │                                   │
│  │ - Tool Calling  │                                   │
│  │ - Chat API      │                                   │
│  └─────────────────┘                                   │
└─────────────────────────────────────────────────────────┘

Key Components

1. MCP Client (Local)

  • Runs on local workstation
  • Integrates with Azure OpenAI for natural language processing
  • Uses Server-Sent Events (SSE) for real-time communication
  • Supports multi-user scenarios with user isolation

2. MCP Server (AKS)

  • Built using FastMCP framework
  • Deployed as Kubernetes pods on AKS
  • Provides memory operations: store, retrieve, search, delete
  • Handles user isolation and session management
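
Under the hood, the store and search operations reduce to file writes and reads against the mounted share. A minimal sketch of that core logic (function names, the metadata format, and the storage root are illustrative, not the demo's actual code; the FastMCP decorators that would expose these as MCP tools are omitted):

```python
import json
from datetime import datetime
from pathlib import Path

STORAGE_ROOT = Path("/memento_storage")  # Azure Files mount point (illustrative)

def store_memory(user: str, content: str, tags: list[str]) -> str:
    """Write content plus a metadata sidecar into the user's directory."""
    user_dir = STORAGE_ROOT / user
    user_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    memory_file = user_dir / f"{stamp}_memory.txt"
    memory_file.write_text(content)
    # Sidecar .meta file mirrors the layout shown in the storage tree below
    (user_dir / f"{memory_file.name}.meta").write_text(
        json.dumps({"tags": tags, "created": stamp})
    )
    return str(memory_file)

def search_memories(user: str, query: str) -> list[str]:
    """Naive case-insensitive substring search across a user's memories."""
    user_dir = STORAGE_ROOT / user
    if not user_dir.is_dir():
        return []
    return [
        f.name
        for f in sorted(user_dir.glob("*_memory.txt"))
        if query.lower() in f.read_text().lower()
    ]
```

Because every pod mounts the same Azure Files share, this file-per-memory layout is what lets multiple replicas serve the same data concurrently.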

3. Storage Layer (Azure Files)

  • Persistent storage mounted to AKS pods
  • Multi-user directory structure
  • Supports concurrent access from multiple pods
  • Each user gets isolated storage space

4. AI Integration (Azure OpenAI)

  • Natural language processing for memory operations
  • Tool calling capability for MCP functions
  • GPT-4o-mini model for efficient processing
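
For tool calling to work, the client advertises the MCP server's tools to the model in the chat-completions `tools` format. A sketch of what the schema for a `store_memory` tool might look like (the parameter names and descriptions are illustrative; the real client would build this list from whatever tools the MCP server advertises):

```python
# One entry in the "tools" list passed to the chat completions API.
STORE_MEMORY_TOOL = {
    "type": "function",
    "function": {
        "name": "store_memory",
        "description": "Store a note or memory for the current user",
        "parameters": {
            "type": "object",
            "properties": {
                "content": {"type": "string", "description": "Text to store"},
                "tags": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Optional tags for later search",
                },
            },
            "required": ["content"],
        },
    },
}
```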

Data Flow Example

  1. User Input: "Store this meeting summary about the Q4 planning session"
  2. Local Processing: MCP Client processes the request with Azure OpenAI
  3. Tool Calling: Azure OpenAI generates appropriate tool calls (store_memory)
  4. Remote Execution: Client sends tool calls to AKS-hosted MCP Server via SSE
  5. Storage: MCP Server writes data to Azure Files with user isolation
  6. Response: Success confirmation flows back to user
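
Steps 3 and 4 hinge on turning the model's generated tool call into a request the MCP server understands. A sketch of that dispatch logic, with the model response stubbed as a plain dict (in the real client, this object comes back from the Azure OpenAI chat completions API, and the handler forwards the call to the remote server over SSE rather than running locally):

```python
import json

# Local registry mapping tool names to handlers. The stub handler stands in
# for the SSE round trip to the AKS-hosted MCP server.
def store_memory(content: str, tags: list[str]) -> dict:
    return {"status": "stored", "tags": tags}

TOOL_HANDLERS = {"store_memory": store_memory}

def dispatch_tool_call(tool_call: dict) -> dict:
    """Route one model-generated tool call to the matching handler."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])  # arguments arrive as a JSON string
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"unknown tool: {name}")
    return handler(**args)

# Shape mirrors a chat-completions tool call for the example user input above
stub_call = {
    "function": {
        "name": "store_memory",
        "arguments": json.dumps(
            {"content": "Q4 planning session summary", "tags": ["meeting", "q4"]}
        ),
    }
}
```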

Key Technical Learnings

Network Binding Challenge

One interesting technical challenge was that FastMCP hardcodes localhost binding, making the server unreachable through Kubernetes Services. This was solved using Python monkey patching:

# Monkey patch uvicorn's Config to force host binding
import uvicorn.config
original_config_init = uvicorn.config.Config.__init__

def patched_config_init(self, app, *args, **kwargs):
    kwargs['host'] = '0.0.0.0'  # Listen on all interfaces
    kwargs['port'] = 8000
    return original_config_init(self, app, *args, **kwargs)

uvicorn.config.Config.__init__ = patched_config_init

Session Affinity

For stable SSE connections, the Kubernetes Service uses ClientIP session affinity to ensure requests from the same client always reach the same pod:

sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 10800  # 3 hours
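
In context, these affinity settings sit alongside the rest of the Service definition. A sketch of a complete manifest (the names, namespace, and source range are illustrative, not the demo's actual values):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: memento-mcp-server
  namespace: memento
spec:
  type: LoadBalancer
  loadBalancerSourceRanges:      # the IP whitelist from the diagram
    - 203.0.113.0/24             # example client range (RFC 5737)
  selector:
    app: memento-mcp-server
  ports:
    - port: 8000
      targetPort: 8000
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800      # 3 hours, outlasting typical SSE sessions
```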

User Isolation

The system maintains user isolation through directory-based separation in Azure Files:

/memento_storage/
├── alice/
│   ├── 20250118_memory.txt
│   └── 20250118_memory.txt.meta
├── bob/
│   └── ...
└── charlie/
    └── ...
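
A practical wrinkle with directory-per-user isolation is ensuring a user ID can't escape its directory (e.g. `../alice`). A minimal sketch of the kind of check a server would want here (the allowed-character policy is an assumption, not the demo's actual validation):

```python
import re
from pathlib import Path

STORAGE_ROOT = Path("/memento_storage")     # Azure Files mount (illustrative)
USER_ID = re.compile(r"^[a-z0-9_-]{1,32}$")  # assumed policy, not the demo's

def user_directory(user: str) -> Path:
    """Return the isolated directory for a user, rejecting path tricks."""
    if not USER_ID.match(user):
        raise ValueError(f"invalid user id: {user!r}")
    path = (STORAGE_ROOT / user).resolve()
    # Belt and braces: the resolved path must stay under the storage root
    if STORAGE_ROOT.resolve() not in path.parents:
        raise ValueError("path escapes storage root")
    return path
```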

Configuration

The system uses environment-driven configuration for flexibility:

# .env file
MCP_SERVER_HOSTNAME=AKS_LOAD_BALANCER_IP
MCP_SERVER_PORT=8000
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_API_KEY=your-api-key
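
On the client side, these variables can be read with plain `os.getenv`, with fallbacks for local testing (the variable names match the `.env` above; the defaults and the SSE endpoint path are illustrative, and in practice a library like python-dotenv would load the `.env` file into the environment first):

```python
import os

def load_config() -> dict:
    """Assemble client configuration from environment variables."""
    host = os.getenv("MCP_SERVER_HOSTNAME", "localhost")
    port = int(os.getenv("MCP_SERVER_PORT", "8000"))
    return {
        "server_url": f"http://{host}:{port}/sse",  # endpoint path is illustrative
        "azure_openai_endpoint": os.getenv("AZURE_OPENAI_ENDPOINT", ""),
        "azure_openai_api_key": os.getenv("AZURE_OPENAI_API_KEY", ""),
    }
```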

Security Considerations

Current State: This demo uses minimal security for educational purposes only.

Future Enhancements Required for Production:

  • User authentication and authorization
  • TLS/SSL encryption for data in transit
  • Internal (private IP) load balancer instead of a public endpoint
  • Azure Files encryption at rest
  • RBAC and fine-grained permissions
  • Comprehensive audit logging

Deployment Process

Warning: This deployment is for demonstration and development purposes only.

  1. Build and Push Container:

    az acr build --registry your-registry --image memento-mcp-server:v1.0.5 .
  2. Deploy to AKS:

    kubectl apply -f k8s/namespace.yaml
    kubectl apply -f k8s/deployment.yaml
    kubectl apply -f k8s/service.yaml
  3. Configure Client: Update .env file with AKS Load Balancer IP

  4. Test Connection:

    python memento_mcp_client_interactive.py

Future Roadmap

This Memento example serves as a foundation for learning MCP by progressively adding features:

  1. Security Focus (Next Blog): Authentication, encryption, and secure access patterns
  2. Advanced Features: Enhanced search capabilities, content versioning
  3. Monitoring & Observability: Logging, metrics, and health checks
  4. Production Readiness: High availability, backup strategies, and operational excellence

Conclusion

The Memento MCP demo demonstrates the evolution from simple local file systems to comprehensive cloud-native deployments. Key learning outcomes include:

  • Scalability Concepts: Kubernetes deployment with multiple replicas
  • Persistence Patterns: Azure Files for durable, shared storage
  • Remote Access Patterns: Secure communication from local clients to cloud services
  • Multi-tenancy Basics: User isolation in shared storage environments
  • AI Integration: Natural language interfaces with Azure OpenAI
  • Network Considerations: Session affinity for stable long-lived connections

This foundation showcases MCP's potential in cloud-native environments while highlighting real-world challenges and practical solutions. The next blog post will build upon this foundation to address production-ready security patterns and operational excellence.

For a detailed implementation reference, check out the architecture document.
For source code and other files, check out the complete source code and documentation. A sample demo video is also available.



Written by Sridher Manivel, based out of Charlotte, NC. LinkedIn