BLOG_POST / azur-lane-reverse-game-server

Belfast: Reverse engineering a mobile game server

tl;dr summary

I started Belfast because I wanted to stop playing a game and learn mobile reverse engineering. Somewhere along the way, I ended up building a server emulator, a tooling ecosystem, and an LLM-driven testing loop.

In December 2024, I decided to quit a game.

Instead of doing the healthy thing and uninstalling it, I reverse engineered the protocol.

This post is the technical write-up I wish I had when I started: not just “I made a private server,” but how packet framing works, how login flows are stitched together, how game state is persisted, and which architectural choices made the project survivable.

Belfast is a Go server that speaks Azur Lane’s client protocol, with:

  • Custom TCP transport and packet framing
  • Protobuf-based request/response handling
  • Gameplay handlers grouped by domain
  • PostgreSQL persistence (migrations + SQLC)
  • Embedded admin API (Iris + Swagger)
  • Packet analysis and progress tooling

It did not grow linearly. There was an early proof-of-concept phase, a long pause, and then a second phase where architecture and tooling became non-negotiable.


It’s not (just) about anime girls and lewdness

The original motivation was educational: learn mobile reverse engineering on a real system with real constraints.

The first week looked like this:

  • PCAPs with no obvious structure
  • Fragmented ADB/Unity logs
  • Partial hints from extracted client resources
  • A lot of wrong assumptions

The methodological shift was simple but important: stop trying to decode gameplay semantics first. Lock down deterministic layers in order:

  1. Frame boundaries
  2. Packet IDs
  3. Payload encoding
  4. State transitions
  5. Gameplay semantics

Without that ordering, every experiment looks random.


Wire protocol, decoded

The first practical breakthrough came from a tiny header sample:

0x01 0x89 0x00 0x2a 0x31 0x00

In that sample, 0x01 0x89 is the frame size, 0x00 is a sentinel, and 0x2a 0x31 is the packet ID: 10801 in decimal, which maps to SC_10801. Once that clicked, the framing model became clear.

In Belfast (internal/packets/magic.go), the frame structure is treated as a strict contract:

  • 2 bytes: packet size
  • 1 byte: sentinel (0x00)
  • 2 bytes: packet ID
  • 2 bytes: packet index
  • N bytes: protobuf payload

func GetPacketId(offset int, buffer *[]byte) int {
	var id int
	id = int((*buffer)[3+offset]) << 8
	id += int((*buffer)[4+offset])
	return id
}

func GetPacketSize(offset int, buffer *[]byte) int {
	var size int
	size = int((*buffer)[0+offset]) << 8
	size += int((*buffer)[1+offset])
	return size
}

On egress, headers are rebuilt explicitly instead of relying on implicit middleware. That makes packet reproduction deterministic and debuggable.

func GeneratePacketHeader(packetId int, payload *[]byte, packetIndex int) []byte {
	var buffer bytes.Buffer
	payloadSize := len(*payload) + 5
	buffer.Write([]byte{byte(payloadSize >> 8), byte(payloadSize)})
	buffer.Write([]byte{0x00})
	buffer.Write([]byte{byte(packetId >> 8), byte(packetId)})
	buffer.Write([]byte{byte(packetIndex >> 8), byte(packetIndex)})
	return buffer.Bytes()
}

One subtle detail that matters in practice: packet index is often 0x0000, but can be 0x0001 in multi-packet frames. Ignoring it can cause “almost works” behavior that is painful to debug.
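Because the size field leads every frame, splitting a raw TCP stream into frames is mechanical. A minimal sketch of that walk, using the layout above (the helper name is mine, not Belfast's):

```go
package main

import "fmt"

// splitFrames walks a byte stream and yields complete frames.
// The 2-byte size field covers everything after itself
// (sentinel + id + index + payload), so a full frame is size+2 bytes.
func splitFrames(buf []byte) [][]byte {
	var frames [][]byte
	for off := 0; off+2 <= len(buf); {
		size := int(buf[off])<<8 | int(buf[off+1])
		end := off + 2 + size
		if end > len(buf) {
			break // incomplete frame: wait for more bytes
		}
		frames = append(frames, buf[off:end])
		off = end
	}
	return frames
}

func main() {
	// Two empty-payload frames for SC_10801 (0x2a31), indices 0 and 1.
	stream := []byte{
		0x00, 0x05, 0x00, 0x2a, 0x31, 0x00, 0x00,
		0x00, 0x05, 0x00, 0x2a, 0x31, 0x00, 0x01,
	}
	for _, f := range splitFrames(stream) {
		id := int(f[3])<<8 | int(f[4])
		idx := int(f[5])<<8 | int(f[6])
		fmt.Printf("packet %d index %d\n", id, idx)
	}
}
```

Anything left over after the last complete frame stays in the buffer for the next read, which is exactly what the ring-buffer read loop relies on.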


Bootstrap flow (real packets, real handlers)

After framing, the next challenge is consistency: the boot/login sequence has to be reproduced in the right order with coherent state.

Typical sequence:

  1. CS_10800 -> SC_10801 (Update check)
  2. CS_10700 -> SC_10701 (Gateway info)
  3. CS_10020 -> SC_10021 (Auth confirm + server list)
  4. CS_10022 -> SC_10023 (Join server)
  5. CS_10024 -> SC_10025 (Create player, if needed)
  6. CS_11001 -> fan-out of initial state sync packets

CS_10020 / SC_10021: identity bootstrap

HandleAuthConfirm (internal/answer/auth_confirm.go) binds login input to account identity, then emits a server ticket and server list.

intArg2, err := strconv.Atoi(payload.GetArg2())
if err != nil {
	return 0, 10021, fmt.Errorf("failed to convert arg2 to int: %s", err.Error())
}
client.AuthArg2 = uint32(intArg2)
protoValidAnswer.ServerTicket = proto.String(formatServerTicket(client.AuthArg2))

yostarusAuth, err := orm.GetYostarusMapByArg2(uint32(intArg2))
if err != nil && db.IsNotFound(err) && config.Current().CreatePlayer.SkipOnboarding {
	accountID, err := client.CreateCommander(uint32(intArg2))
	if err != nil {
		return 0, 10021, err
	}
	protoValidAnswer.AccountId = proto.Uint32(accountID)
}

Architectural point: this packet is not just “auth yes/no.” It is where account creation strategy is decided (skip_onboarding path), which directly affects downstream packet expectations.

CS_10022 / SC_10023: session coherence

JoinServer (internal/answer/join_server.go) resolves account identity from multiple sources (account_id, device_id, server ticket), loads commander state, and enforces one active session per commander.

if client.Server != nil {
	existingKicked := client.Server.DisconnectCommander(
		client.Commander.CommanderID,
		consts.DR_LOGGED_IN_ON_ANOTHER_DEVICE,
		client,
	)
	if existingKicked {
		logger.LogEvent("Server", "LoginKick",
			fmt.Sprintf("kicked previous session for commander %d", client.Commander.CommanderID),
			logger.LOG_LEVEL_INFO)
	}
}

This is one of those choices that prevents a lot of weirdness: duplicate active sessions can create impossible state races if left unchecked.

CS_10024 / SC_10025: account creation guardrails

CreateNewPlayer (internal/answer/onboarding/create_new_player.go) enforces name policy, starter ship validity, and device/account binding constraints before provisioning state.

nameLength := utf8.RuneCountInString(nickname)
if nameLength < createPlayerNameMin {
	response.Result = proto.Uint32(2012)
	return client.SendMessage(10025, &response)
}
if nameLength > createPlayerNameMax {
	response.Result = proto.Uint32(2011)
	return client.SendMessage(10025, &response)
}

if _, ok := starterShipIDs[shipID]; !ok {
	response.Result = proto.Uint32(1)
	return client.SendMessage(10025, &response)
}

This keeps onboarding behavior deterministic and protects future reconnect flow via stable device mapping.


Transport and dispatch architecture

The networking stack is intentionally explicit. When reverse engineering a binary protocol, “clever” transport abstractions usually hurt more than they help.

Server side (internal/connection/server.go)

  • Accept TCP connection
  • Validate maintenance/private-client constraints
  • Read from socket into ring buffer
  • Parse packet size first, then body
  • Enqueue frames into per-client queue

Client side (internal/connection/client.go)

  • Bounded queue (packetQueueSize = 512)
  • Reusable packet buffer pool (packetPoolSize = 128)
  • Dedicated dispatch loop goroutine
  • Backpressure when queue is full
  • Runtime metrics (queue depth, blocks, errors, packets)
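The enqueue side of that design can be sketched with a buffered channel: a non-blocking send that counts full-queue events, so backpressure shows up in metrics instead of silently stalling the read loop. Types and names here are illustrative, not Belfast's actual implementation:

```go
package main

import "fmt"

// packetQueue is a bounded per-client frame queue. When it is full,
// enqueue reports the fact instead of blocking the socket reader.
type packetQueue struct {
	frames chan []byte
	blocks int // times the queue was full on enqueue (a backpressure metric)
}

func newPacketQueue(size int) *packetQueue {
	return &packetQueue{frames: make(chan []byte, size)}
}

func (q *packetQueue) enqueue(frame []byte) bool {
	select {
	case q.frames <- frame:
		return true
	default:
		q.blocks++ // caller decides whether to drop, wait, or disconnect
		return false
	}
}

func main() {
	q := newPacketQueue(2)
	for i := 0; i < 3; i++ {
		if !q.enqueue([]byte{byte(i)}) {
			fmt.Println("queue full, blocks =", q.blocks)
		}
	}
}
```

A dedicated goroutine draining `q.frames` is then the dispatch loop; queue depth is just `len(q.frames)`.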

Dispatch layer (internal/packets/handler.go)

Dispatch resolves handlers by packet ID, applies all handlers for that packet, and flushes buffered writes afterward.

handlers, ok := PacketDecisionFn[packetId]
headerlessBuffer := (*buffer)[offset+HEADER_SIZE:]
if !ok {
	_, _, err := client.SendMessage(10998, &protobuf.SC_10998{
		Cmd:    proto.Uint32(uint32(packetId)),
		Result: proto.Uint32(1),
	})
	if err != nil {
		return
	}
} else {
	for _, handler := range handlers {
		_, _, err := handler(&headerlessBuffer, client)
		if err != nil {
			client.CloseWithError(err)
			return
		}
	}
}

Architectural choice that paid off: handlers write to a client buffer and dispatch flushes once per pass. That reduces syscall churn and keeps ordering deterministic within one frame-processing cycle.
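The pattern itself is easy to sketch: handlers append framed bytes to a per-client buffer, and dispatch performs one write per pass. Hypothetical types, just to show the shape:

```go
package main

import (
	"bytes"
	"fmt"
)

// clientConn sketches the write-buffer-then-flush pattern: queueMessage
// only appends, flush drains everything in a single write. Against a
// real socket, writes would approximate syscalls.
type clientConn struct {
	out    bytes.Buffer
	writes int
}

func (c *clientConn) queueMessage(frame []byte) {
	c.out.Write(frame)
}

func (c *clientConn) flush() []byte {
	c.writes++
	sent := append([]byte(nil), c.out.Bytes()...)
	c.out.Reset()
	return sent
}

func main() {
	c := &clientConn{}
	c.queueMessage([]byte{0x00, 0x05, 0x00, 0x2a, 0x31, 0x00, 0x00})
	c.queueMessage([]byte{0x00, 0x05, 0x00, 0x2b, 0x0d, 0x00, 0x00})
	sent := c.flush()
	fmt.Printf("flushed %d bytes in %d write(s)\n", len(sent), c.writes)
}
```

Two responses, one write: ordering within the frame-processing cycle is preserved because the buffer is append-only until the flush.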


Region-aware routing instead of region spaghetti

Azur Lane behavior differs by region (CN/EN/JP/KR/TW). Belfast handles that at registration time, not deep inside every handler.

packets.RegisterLocalizedPacketHandler(13101, packets.LocalizedHandler{
	CN: &[]packets.PacketHandler{answer.ChapterTracking},
	EN: &[]packets.PacketHandler{answer.ChapterTracking},
	JP: &[]packets.PacketHandler{answer.ChapterTracking},
	KR: &[]packets.PacketHandler{answer.ChapterTrackingKR},
	TW: &[]packets.PacketHandler{answer.ChapterTracking},
})

This keeps packet-specific logic focused on behavior, while region variability stays in one predictable place.


Persistence and migration discipline

Gameplay packets are state transitions, so persistence must be boring and strict.

Stack:

  • PostgreSQL
  • SQLC-generated query layer
  • ORM/domain loading helpers
  • Embedded migration runner with checksums

if _, err := lockConn.ExecContext(acquireCtx,
	`SELECT pg_advisory_lock($1, $2)`,
	migrationAdvisoryLockClassID,
	lockObjectID,
); err != nil {
	return err
}

if appliedChecksum, ok := applied[m.Version]; ok {
	if appliedChecksum != m.Checksum {
		return fmt.Errorf("migration %d already applied but checksum changed", m.Version)
	}
	continue
}

That lock + checksum pair is not glamorous, but it prevents migration races and silent drift across environments.


Game data ingestion as a first-class system

Most gameplay handlers depend on external game datasets (ships, chapter templates, shop data, etc.).

misc.UpdateAllData orchestrates importer functions that fetch JSON from belfast-data and upsert via SQLC.

err := db.DefaultStore.WithTx(ctx, func(q *gen.Queries) error {
	for _, key := range order {
		fn := dataFnSQLC[key]
		if fn == nil {
			return fmt.Errorf("missing sqlc importer for %s", key)
		}
		if err := fn(ctx, region, q); err != nil {
			return err
		}
	}
	return nil
})

Design choice: ingestion is centralized and ordered, which makes reseeding reproducible and easier to reason about after updates.


Chapter system deep dive (where this became real)

Chapter flow is where packet emulation turns into game simulation.

Core handlers in internal/answer/chapter:

  • CS_13101 -> SC_13102 (tracking/start)
  • CS_13103 -> SC_13104 (actions)
  • CS_13106 -> SC_13105 (battle result request)
  • SC_13000 (base sync)

Start/tracking (CS_13101)

ChapterTracking computes resource costs, validates inventory, builds CURRENTCHAPTERINFO, then persists it.

baseOil := template.Oil
oilCost := uint32(float64(baseOil) * rate)
if !client.Commander.HasEnoughResource(2, oilCost) {
	response := protobuf.SC_13102{Result: proto.Uint32(1)}
	return client.SendMessage(13102, &response)
}

if oilCost > 0 {
	if err := client.Commander.ConsumeResource(2, oilCost); err != nil {
		return 0, 13102, err
	}
}

Move/action (CS_13103)

Movement uses BFS over walkable chapter grid cells, then updates fleet position and step counters.

start := chapterPos{Row: group.GetPos().GetRow(), Column: group.GetPos().GetColumn()}
end := chapterPos{Row: payload.GetActArg_1(), Column: payload.GetActArg_2()}
path := findMovePath(grids, start, end)
if len(path) == 0 {
	response := protobuf.SC_13104{Result: proto.Uint32(1)}
	return client.SendMessage(13104, &response)
}

stepDelta := uint32(len(path) - 1)
group.Pos = buildPos(end)
group.StepCount = proto.Uint32(group.GetStepCount() + stepDelta)
current.MoveStepCount = proto.Uint32(current.GetMoveStepCount() + stepDelta)
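findMovePath itself is not shown above; under the assumption that it is a plain breadth-first search over walkable cells, a sketch looks like this (the real handler also consults chapter grid flags, which I've reduced to a walkability set):

```go
package main

import "fmt"

type chapterPos struct{ Row, Column uint32 }

// findMovePathSketch is a minimal BFS in the spirit of the handler's
// findMovePath. It returns the shortest path including both start and
// end, or nil if the end cell is unreachable.
func findMovePathSketch(walkable map[chapterPos]bool, start, end chapterPos) []chapterPos {
	if !walkable[start] || !walkable[end] {
		return nil
	}
	prev := map[chapterPos]chapterPos{start: start}
	queue := []chapterPos{start}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if cur == end {
			// Walk prev links back to start, prepending as we go.
			var path []chapterPos
			for p := end; ; p = prev[p] {
				path = append([]chapterPos{p}, path...)
				if p == start {
					return path
				}
			}
		}
		for _, n := range neighbors(cur) {
			if _, seen := prev[n]; !seen && walkable[n] {
				prev[n] = cur
				queue = append(queue, n)
			}
		}
	}
	return nil
}

func neighbors(p chapterPos) []chapterPos {
	out := []chapterPos{{p.Row + 1, p.Column}, {p.Row, p.Column + 1}}
	if p.Row > 0 {
		out = append(out, chapterPos{p.Row - 1, p.Column})
	}
	if p.Column > 0 {
		out = append(out, chapterPos{p.Row, p.Column - 1})
	}
	return out
}

func main() {
	grid := map[chapterPos]bool{
		{0, 0}: true, {0, 1}: true, {0, 2}: true,
		{1, 2}: true,
	}
	path := findMovePathSketch(grid, chapterPos{0, 0}, chapterPos{1, 2})
	fmt.Println("steps:", len(path)-1) // matches stepDelta = len(path) - 1
}
```

An empty path means "no route", which is exactly the `Result: 1` rejection branch in the handler excerpt.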

Ambush rates

Ambush logic mirrors client-side formulas documented in the codebase. This is one of the highest-fidelity areas because players notice statistical drift quickly.

rate := 0.05 + posExtra + globalExtra
if step > 0 {
	denom := inv + investSums
	if denom > 0 {
		rate += (inv / denom) / 4 * float64(step)
	}
}
if posExtra == 0 {
	rate -= calculateFleetEquipAmbushRateReduce(group, client)
}
rate = clampChance(rate)
return uint32(rate * chapterChanceBase)

Tooling that paid for itself

PCAP decoder (cmd/pcap_decode)

The decoder reconstructs TCP streams, parses protocol frames, auto-decodes protobuf payloads via reflection, and emits JSON lines.

packetID := int(binary.BigEndian.Uint16(buffer[3:5]))
packetIndex := int(binary.BigEndian.Uint16(buffer[5:7]))
payload := buffer[packets.HEADER_SIZE:frameSize]

if constructor, ok := s.registry[packetID]; ok {
	msg := constructor()
	if err := proto.Unmarshal(payload, msg); err != nil {
		record.Error = err.Error()
		record.RawHex = hex.EncodeToString(payload)
	} else {
		record.JSON, _ = protojson.MarshalOptions{EmitUnpopulated: true}.Marshal(msg)
	}
}

Gateway dumper (cmd/gateway_dump)

I also built a tiny gateway dumper for one specific reconnaissance task.

Its flow is intentionally minimal:

  • Dial a gateway address (--addr host:port) with bounded timeout
  • Send CS_10018 with an empty payload (the handler does not need request fields)
  • Read exactly one framed reply and assert packet ID SC_10019
  • Protobuf-unmarshal the server list, then JSON-print each server (ids, name, ip, port, state, optional proxy fields)

Operationally, the tool is strict on purpose: it sets a connection deadline, validates frame size before reading body bytes, and exits on any packet-ID mismatch instead of trying to recover. CLI defaults are practical for fast scans (--addr 127.0.0.1:80, --timeout-ms 5000, optional --pretty JSON).

Core request logic:

payload := []byte{}
header := connection.GeneratePacketHeader(10018, &payload, 0)
if _, err := conn.Write(header); err != nil {
	return nil, fmt.Errorf("write CS_10018: %w", err)
}

pkt, err := readOnePacket(conn)
if err != nil {
	return nil, err
}
packetID := packets.GetPacketId(0, &pkt)
if packetID != 10019 {
	return nil, fmt.Errorf("unexpected response packet id %d (expected 10019)", packetID)
}

With that, a friend and I scanned IP ranges and cross-checked targets from constants embedded in the game client, then compared the returned server lists across regions/builds. One result was an Audit server entry, likely used for store submission/QA environments.

We connected once, completed the in-game tutorial flow, and created accounts with our nicknames, then stopped there. It was mostly a nerdy “oh wow, that is real” moment before backing out.

Packet recorder

debug.InsertPacket stores payloads for post-mortem analysis.

func InsertPacket(packetId int, payload *[]uint8) {
	if packetId == 8239 {
		return
	}
	err := orm.InsertDebugPacket(len(*payload), packetId, *payload)
	if err != nil {
		logger.LogEvent("Debug", "InsertPacket", err.Error(), logger.LOG_LEVEL_ERROR)
	}
}

Coverage/progress metrics

cmd/packet_progress was one of the highest-leverage tools I built, because it solved a chronic reverse-engineering problem: everyone says “coverage is pretty good,” but nobody can answer “good by what metric?”

The command walks packet registrations, parses handler ASTs, applies heuristic scoring, and emits machine-readable reports. It does not execute handlers; it infers implementation depth from source signals.

Status model:

  • implemented: strong request/response + behavior signals
  • partial: meaningful logic present, but likely incomplete
  • stub: minimal acknowledgment behavior
  • panic: known bad path
  • missing: registered/known packet with no effective implementation

const (
	statusImplemented = "implemented"
	statusPartial     = "partial"
	statusStub        = "stub"
	statusPanic       = "panic"
	statusMissing     = "missing"
)

Scoring is weighted instead of binary. For example, SendMessage, protobuf parse/setter usage, commander/ORM usage, and DB writes all contribute to confidence.

Weights: heuristicWeights{
	SendMessage:  3,
	ResponseType: 2,
	RequestType:  1,
	ProtoSetter:  1,
	RequestParse: 1,
	CommanderUse: 2,
	ORMUsage:     2,
	DBWrite:      2,
},
Thresholds: heuristicThresholds{ImplementedMin: 4}
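Putting weights and threshold together, classification reduces to a small fold over detected signals. This reconstruction uses the quoted weights and ImplementedMin; the partial/stub split below is my assumption, not the tool's exact rule:

```go
package main

import "fmt"

// weights mirrors the heuristicWeights values quoted above,
// keyed by signal name (names are illustrative).
var weights = map[string]int{
	"SendMessage":  3,
	"ResponseType": 2,
	"RequestType":  1,
	"ProtoSetter":  1,
	"RequestParse": 1,
	"CommanderUse": 2,
	"ORMUsage":     2,
	"DBWrite":      2,
}

// classify sums signal weights and maps the total to a status bucket.
// ImplementedMin = 4 per the quoted thresholds; the partial/stub
// boundary here is a guess for illustration.
func classify(signals []string) (score int, status string) {
	for _, s := range signals {
		score += weights[s]
	}
	switch {
	case score >= 4:
		status = "implemented"
	case score > 0:
		status = "partial"
	default:
		status = "stub"
	}
	return score, status
}

func main() {
	score, status := classify([]string{"SendMessage", "RequestParse", "DBWrite"})
	fmt.Printf("score=%d status=%s\n", score, status)
}
```

The useful property is that a refactor which silently drops a signal (say, an ORM call moved behind an interface) shows up as a score regression, not as an invisible behavior change.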

The output includes packet-level and handler-level diagnostics (score, signals, file, line), plus overrides for known exceptions. In practice, this makes roadmap planning much easier:

  • You can sort by high-value missing packets.
  • You can detect regressions when a refactor drops implementation signals.
  • You can separate “known stub” from “silently broken” behavior.

This tool changed planning from intuition to a repeatable coverage process.


The LLM testing loop

Once protocol and gameplay coverage expanded, the bottleneck became repetitive manual UI navigation on Android.

Belfast already had strong ADB primitives (internal/debug/adb_watcher.go):

  • Interactive controls
  • Logcat start/stop/flush/dump
  • Process PID tracking
  • Optional game restart automation

I added an external MCP-style loop on top:

  1. Capture screenshot
  2. Model infers current UI state
  3. Model selects tap target
  4. Send ADB input
  5. Observe logcat + server behavior
  6. Repeat until scenario complete or broken

I do not treat the model as a source of truth. I treat it as a repeatable integration test operator for tedious client flows.


What I learned

  1. Reverse engineering is mostly systems engineering, not cinematic breakthroughs.
  2. Packet-level correctness is necessary, but gameplay semantics are the real finish line.
  3. Boring layers win: deterministic transport, strict persistence, measurable coverage.
  4. Stubs are not failure; they are scaffolding.
  5. Once your test loop is too manual, automation quality dominates delivery speed.

And yes, one of the biggest breakthroughs was still converting hex to decimal in GNOME Calculator.


What comes next

Belfast now has reliable client boot, broad gameplay coverage, and an architecture that can absorb incremental updates. The remaining challenge is not “can this packet be implemented?” but “can updates be absorbed without creating a maintenance tax spiral?”

The roadmap I am currently converging toward has four tracks:

  1. Protocol diff pipeline

    • Automatically compare old/new client protobuf surfaces.
    • Detect new/renamed packet IDs and field-level drift.
    • Generate change candidates before manual reverse engineering starts.
  2. Data synchronization hardening

    • Version game data imports per region.
    • Add strict import validation (missing IDs, shape drift, invalid cross-references).
    • Keep rollback-friendly snapshots for fast bisect when behavior regresses.
  3. Scenario-based regression suite

    • Convert manual test routes into scripted scenarios (login, chapter start, battle result, shop flows).
    • Pair server-side packet assertions with client-side UI/log assertions.
    • Use the LLM loop as a driver, but keep deterministic pass/fail checks server-side.
  4. Handler maintenance ergonomics

    • Expand packet coverage metrics with domain tags (auth/chapter/shop/world).
    • Rank missing packets by runtime frequency and player-facing impact.
    • Generate focused implementation backlogs instead of a flat “missing list.”

The goal is straightforward: make new client versions boring to integrate.

The funny part is that none of this was the plan. I started this project because I wanted to stop playing a game, and somehow ended up building a protocol emulator, a data pipeline, and an AI-assisted test harness around it.

That is probably my favorite thing about this kind of work: if you stay curious long enough, a small reverse-engineering experiment can quietly turn into a real software system.
