| \documentclass[conference]{IEEEtran}
|
|
|
|
|
| \usepackage{cite}
|
| \usepackage{amsmath,amssymb,amsfonts}
|
| \usepackage{algorithmic}
|
| \usepackage{algorithm}
|
| \usepackage{graphicx}
|
| \usepackage{textcomp}
|
| \usepackage{booktabs}
|
| \usepackage{multirow}
|
| \usepackage{tikz}
|
| \usepackage{pgfplots}
|
| \pgfplotsset{compat=1.18}
|
| \usepackage{hyperref}
|
| \usepackage{cleveref}
|
|
|
| \newtheorem{definition}{Definition}
|
| \newtheorem{proposition}{Proposition}
|
|
|
|
|
| \hypersetup{
|
| colorlinks=true,
|
| linkcolor=blue,
|
| citecolor=blue,
|
| urlcolor=blue
|
| }
|
|
|
| \def\BibTeX{{\rm B\kern-.05em{\sc i\kern-.025em b}\kern-.08em
|
| T\kern-.1667em\lower.7ex\hbox{E}\kern-.125emX}}
|
|
|
| \begin{document}
|
|
|
|
|
|
|
|
|
|
|
|
|
| \title{mBA-GMP.\textit{v3}: Micro Bid-Ask Gap-Filled Market Profile\textsuperscript{*}}
|
|
|
| \author{
|
| \href{https://github.com/ContinualQuasars}{\includegraphics[height=1.6ex]{ContinualQuasars_icon.png}}\ \textit{Continual Quasars, Research Team}\\
|
| February 20, 2026
|
| }
|
|
|
| \maketitle
|
|
|
|
|
|
|
|
|
| \begin{abstract}
|
| Conventional Market Profile (CMP) aggregates price activity into histogram
|
| bins using candlestick-derived data (TOCHLV), discarding the intra-bar
|
| microstructure and leaving price bins between consecutive trades empty. We
|
| propose \textbf{mBA-GMP} (\textit{micro Bid-Ask Gap-filled Market Profile}),
|
| a method that (i)~operates on raw, microsecond-resolution bid/ask
|
| tick-formation data rather than pre-aggregated candlesticks, and
|
| (ii)~interpolates every intermediate price bin traversed between successive
|
| ticks, producing a \emph{gap-filled} profile. Building on this gap-filled
|
| structure, we further introduce an \emph{Up/Down-Bin Footprint Profile} that
|
| classifies each bin's contribution directionally, revealing the net upward
|
| or downward pressure across the price traversal. We formalise CMP and
|
| GMP with explicit algorithms, derive the relationship between bin-count and
|
| a user-defined bin-size parameter~$\beta$, and introduce a \emph{dataframe
|
| recording approach} that walks through a 10-datapoint XAUUSD example to
|
| show how datapoints are grouped into price bins (the CMP dataframe),
|
| how gap-filling transforms the sparse CMP output into a dense GMP
|
| dataframe, and how directional footprints are assigned. We demonstrate via
|
| generated charts and CSV data that mBA-GMP never records fewer stacks than
|
| CMP in any bin and is strictly denser whenever consecutive ticks skip at
|
| least one bin.
|
| \end{abstract}
|
|
|
| \begin{IEEEkeywords}
|
| Market Profile, tick data, bid-ask spread, gap-filling interpolation,
|
| high-frequency data, market microstructure, XAUUSD
|
| \end{IEEEkeywords}
|
|
|
| \vspace{0.5\baselineskip}
|
| \hrule
|
| \vspace{0.5\baselineskip}
|
|
|
| {\footnotesize\noindent\textsuperscript{*}This research is conducted by the Continual Quasars Research Team at: {\color{blue}\href{https://github.com/ContinualQuasars}{github.com/ContinualQuasars}}\par}
|
|
|
| \begin{figure}[!t]
|
| \centering
|
| \begin{tikzpicture}
|
| \begin{axis}[
|
| title={\textbf{CMP Profile}},
|
| xbar,
|
| xlabel={Stacks},
|
| ylabel={Price (USD)},
|
| ytick={3000,3001,...,3010},
|
| yticklabel style={font=\scriptsize},
|
| xmin=0, xmax=2,
|
| ymin=2999.5, ymax=3010.5,
|
| bar width=4pt,
|
| width=0.42\columnwidth,
|
| height=6.5cm,
|
| enlarge y limits=0.05,
|
| nodes near coords,
|
| nodes near coords style={font=\tiny},
|
| name=cmp
|
| ]
|
| \addplot[fill=gray!60, draw=black] coordinates {
|
| (1,3000) (0,3001) (0,3002) (0,3003) (0,3004) (0,3005)
|
| (0,3006) (0,3007) (0,3008) (0,3009) (1,3010)
|
| };
|
| \end{axis}
|
|
|
| \begin{axis}[
|
| title={\textbf{GMP Profile}},
|
| xbar,
|
| xlabel={Stacks},
|
| ylabel={},
|
| ytick={3000,3001,...,3010},
|
| yticklabel style={font=\scriptsize},
|
| xmin=0, xmax=2,
|
| ymin=2999.5, ymax=3010.5,
|
| bar width=4pt,
|
| width=0.42\columnwidth,
|
| height=6.5cm,
|
| enlarge y limits=0.05,
|
| nodes near coords,
|
| nodes near coords style={font=\tiny},
|
| at={(cmp.east)},
|
| anchor=west,
|
| xshift=1.2cm
|
| ]
|
| \addplot[fill=blue!50, draw=black] coordinates {
|
| (1,3000) (1,3001) (1,3002) (1,3003) (1,3004) (1,3005)
|
| (1,3006) (1,3007) (1,3008) (1,3009) (1,3010)
|
| };
|
| \end{axis}
|
| \end{tikzpicture}
|
| \caption{Horizontal histogram comparison of CMP (left, grey) and GMP
|
| (right, blue) for XAUUSD with $\beta=1$. CMP shows activity only at
|
| the two observed prices; GMP fills all 11~traversed bins. The
|
| gap-filling approach is most effective when applied to micro bid/ask
|
| (mBA) raw tick-formation data.}
|
| \label{fig:profile}
|
| \end{figure}
|
|
|
|
|
|
|
| \section{Introduction}\label{sec:intro}
|
|
|
| The Market Profile, introduced by Steidlmayer~\cite{steidlmayer1986market}
|
| and later formalised by Dalton et~al.~\cite{dalton2007markets}, represents
|
| price activity as a horizontal histogram whose bins correspond to discrete
|
| price levels and whose bar lengths (``stacks'') reflect the amount of
|
| activity observed at each level. In practice, most implementations
|
| construct the profile from candlestick TOCHLV (time, open, close, high, low, volume) data: each candle
|
| contributes one stack to every bin between its high and low.
|
|
|
| This approach suffers from two shortcomings:
|
|
|
| \begin{enumerate}
|
| \item \textbf{Aggregation loss.}\;Candlesticks pre-aggregate raw ticks
|
| into time-based bars chosen by the broker or exchange, irreversibly
|
| discarding the sequence and microsecond timing of individual
|
| bid/ask updates~\cite{engle2000econometrics,easley1992time}.
|
| \item \textbf{Gap neglect.}\;When consecutive \emph{raw} ticks are
|
| separated by several price levels, the conventional profile records
|
| activity only at the two observed prices, ignoring the fact that price
|
| must have traversed every intermediate level.
|
| \end{enumerate}
|
|
|
| We address both issues with \textbf{mBA-GMP}. The prefix \textit{mBA}
|
| (micro Bid-Ask) specifies the data domain: raw bid/ask tick-formation
|
| records stamped at microsecond or millisecond resolution, i.e., the
|
| smallest observable price changes.
|
| The suffix \textit{GMP} (Gap-filled Market Profile) specifies the
|
| construction rule: every price bin between two successive ticks receives an
|
| interpolated stack, producing a profile with no gaps.
|
|
|
|
|
| The remainder of this paper is organised as follows.
|
| \Cref{sec:related} surveys related work.
|
| \Cref{sec:prelim} establishes notation.
|
| \Cref{sec:method} defines CMP and GMP formally, presents the mBA-GMP
|
| algorithm, and introduces the Up/Down-Bin Footprint Profile.
|
| \Cref{sec:dataframe} introduces the dataframe recording approach and
|
| walks through a 10-datapoint worked example showing CMP, GMP, and footprint
|
| construction step by step.
|
| \Cref{sec:binsize} analyses the effect of bin-size on profile resolution.
|
| \Cref{sec:example} provides an additional worked example with XAUUSD.
|
| \Cref{sec:discussion} discusses practical implications, and
|
| \Cref{sec:conclusion} concludes.
|
|
|
|
|
|
|
|
|
|
|
| \section{Related Work}\label{sec:related}
|
|
|
| \subsection{Market Profile}
|
| The Market Profile concept originates with Steidlmayer's observation that
|
| price distributions at each level reveal where market participants find
|
| ``fair value''~\cite{steidlmayer1986market}. Dalton et~al.~\cite{dalton2007markets} extended the framework with auction-market
|
| theory, using half-hour brackets as time-price opportunities (TPOs). Both
|
| formulations rely on time-based bars rather than raw ticks.
|
|
|
| \subsection{Tick-Level Analysis}
|
| Clark~\cite{clark1973subordinated} demonstrated that subordinating returns
|
| to an activity-driven clock, proxied in his study by trading volume, yields
|
| closer-to-Gaussian distributions, motivating activity-indexed (rather than
|
| time-indexed) analysis.
|
| An\'{e} and Geman~\cite{ane2000order} showed that a business-time
|
| transformation driven by the number of trades restores near-normality of
|
| returns at the tick level.
|
| Engle~\cite{engle2000econometrics} introduced econometric models tailored
|
| to ultra-high-frequency data.
|
|
|
|
|
| \subsection{Market Microstructure}
|
| The theoretical foundations of bid-ask price formation are laid out by
|
| Glosten and Milgrom~\cite{glosten1985bid}, O'Hara~\cite{ohara1995market},
|
| and the comprehensive survey of Madhavan~\cite{madhavan2000market}.
|
| Hasbrouck~\cite{hasbrouck2007empirical} provides empirical methods for
|
| tick-level inference. Bouchaud et~al.~\cite{bouchaud2018trades} present a
|
| modern, physics-inspired treatment linking order flow to price dynamics.
|
|
|
| A common thread across these works is that raw tick data preserves
|
| information lost by any form of aggregation. Our contribution is to
|
| combine this insight with a gap-filling interpolation rule applied to the
|
| Market Profile histogram.
|
|
|
|
|
|
|
|
|
|
|
| \section{Preliminaries}\label{sec:prelim}
|
|
|
| \Cref{tab:notation} summarises the notation used throughout.
|
|
|
| \begin{table}[!t]
|
| \centering
|
| \caption{Notation Summary}
|
| \label{tab:notation}
|
| \begin{tabular}{@{}cl@{}}
|
| \toprule
|
| \textbf{Symbol} & \textbf{Description} \\
|
| \midrule
|
| $N$ & Total number of raw ticks in the dataset \\
|
| $p_i$ & Price of the $i$-th tick, $i\in\{1,\dots,N\}$ \\
|
| $\beta$ & Bin size (price units per bin); default $\beta=1$ \\
|
| $b(p)$ & Bin index of price $p$: $b(p)=\lfloor p/\beta \rfloor$ \\
|
| $S[k]$ & Stack count (profile value) at bin~$k$ \\
|
| $\Delta_i$ & Price displacement: $\Delta_i = p_i - p_{i-1}$ \\
|
| $K_i$ & Bins traversed from tick $i{-}1$ to $i$ (endpoints included) \\
|
| $G_i$ & Gap-filled bins strictly between ticks $i{-}1$ and $i$ \\
|
| $U[k]$ & Up-bin count at bin $k$ \\
|
| $D[k]$ & Down-bin count at bin $k$ \\
|
| $\delta[k]$ & Net footprint delta at bin $k$: $\delta[k] = U[k] - D[k]$ \\
|
| \bottomrule
|
| \end{tabular}
|
| \end{table}
|
|
|
| \begin{definition}[Tick-formation]
|
| A \emph{tick-formation} is the smallest observable change in the bid or
|
| ask price as recorded by the broker. Formally, a tick stream is an
|
| ordered sequence $\mathcal{T}=\{(t_i,\,p_i)\}_{i=1}^{N}$ where $t_i$
|
| is the timestamp (microsecond or millisecond resolution) and $p_i$ is the
|
| observed price.
|
| \end{definition}
|
|
|
| \begin{definition}[Bin]
|
| Given bin size $\beta>0$, the \emph{bin} for price $p$ is the integer
|
| index
|
| \begin{equation}\label{eq:bin}
|
| b(p) = \left\lfloor \frac{p}{\beta} \right\rfloor.
|
| \end{equation}
|
| All prices $p$ satisfying $k\beta \le p < (k+1)\beta$ map to bin~$k$.
|
| \end{definition}
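|
|
|
| As a brief illustration of \eqref{eq:bin} (the price is datapoint~B of
|
| \Cref{tab:datapoints}; the value $\beta=0.5$ is an assumption made only for
|
| this example):
|
| \begin{align*}
|
| \beta=1:\quad & b(3003.837)=\lfloor 3003.837\rfloor=3003,\\
|
| \beta=0.5:\quad & b(3003.837)=\lfloor 6007.674\rfloor=6007.
|
| \end{align*}
|
| The two bins cover the price intervals $[3003,\,3004)$ and
|
| $[3003.5,\,3004)$, respectively.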
|
|
|
|
|
| \begin{definition}[Market Profile]
|
| A \emph{market profile} is a mapping $S:\mathbb{Z}\to\mathbb{N}_0$
|
| where $S[k]$ counts the number of stacks accumulated at bin~$k$.
|
| \end{definition}
|
|
|
|
|
|
|
|
|
|
|
| \section{Methodology}\label{sec:method}
|
|
|
| \subsection{Conventional Market Profile (CMP)}\label{sec:cmp}
|
|
|
| CMP records a stack only at the bin of each observed data point:
|
| \begin{equation}\label{eq:cmp}
|
| S_{\text{CMP}}[k] \;=\; \sum_{i=1}^{N} \mathbf{1}\!\bigl[b(p_i)=k\bigr],
|
| \end{equation}
|
| where $\mathbf{1}[\cdot]$ is the indicator function. Bins with no
|
| observed tick receive $S_{\text{CMP}}[k]=0$.
|
|
|
| \smallskip
|
| \begin{algorithm}[!t]
|
| \caption{CMP Construction}\label{alg:cmp}
|
| \begin{algorithmic}[1]
|
| \REQUIRE Tick stream $\{p_i\}_{i=1}^{N}$, bin size $\beta$
|
| \ENSURE Profile array $S_{\text{CMP}}[\cdot]$
|
| \STATE Initialise $S_{\text{CMP}}[k]\leftarrow 0\;\;\forall\,k$
|
| \FOR{$i = 1$ \TO $N$}
|
| \STATE $k \leftarrow \lfloor p_i / \beta \rfloor$
|
| \STATE $S_{\text{CMP}}[k] \leftarrow S_{\text{CMP}}[k] + 1$
|
| \ENDFOR
|
| \RETURN $S_{\text{CMP}}$
|
| \end{algorithmic}
|
| \end{algorithm}
|
|
|
| \textbf{Complexity.}\;CMP performs exactly $N$ bin-index computations and
|
| $N$ increments, giving $\mathcal{O}(N)$ time complexity.
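|
|
|
| As a small sanity check of \eqref{eq:cmp}, consider a hypothetical
|
| three-tick stream $p=(3000.2,\;3000.7,\;3002.1)$ with $\beta=1$ (these
|
| prices are illustrative assumptions, not data from this paper). The bin
|
| indices are $(3000,\,3000,\,3002)$, so
|
| \[
|
| S_{\text{CMP}}[3000]=2,\qquad
|
| S_{\text{CMP}}[3001]=0,\qquad
|
| S_{\text{CMP}}[3002]=1,
|
| \]
|
| and bin~$3001$ stays empty even though price moved through it.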
|
|
|
| \subsection{Gap-Filled Market Profile (GMP)}\label{sec:gmp}
|
|
|
| GMP augments CMP by filling every \emph{intermediate} bin between two
|
| consecutive ticks. The construction proceeds in two phases:
|
|
|
| \begin{enumerate}
|
| \item \textbf{CMP placement.}\;Each tick~$p_i$ contributes one stack to
|
| its own bin~$b(p_i)$, exactly as in CMP.
|
| \item \textbf{Gap-filling.}\;For each consecutive pair
|
| $(p_{i-1},\,p_i)$ with $i\ge 2$, every bin \emph{strictly between}
|
| $b(p_{i-1})$ and $b(p_i)$ (exclusive of both endpoints) receives one
|
| additional stack.
|
| \end{enumerate}
|
|
|
| \noindent Formally, writing $b_i = b(p_i)$:
|
|
|
| \begin{equation}\label{eq:gmp}
|
| S_{\text{GMP}}[k]
|
| \;=\;
|
| \underbrace{\sum_{i=1}^{N}\mathbf{1}\!\bigl[b_i=k\bigr]}_{S_{\text{CMP}}[k]}
|
| \;+\;
|
| \sum_{i=2}^{N}
|
| \;\sum_{j=\min(b_{i-1},\,b_i)+1}^{\max(b_{i-1},\,b_i)-1}
|
| \!\mathbf{1}\!\bigl[j=k\bigr].
|
| \end{equation}
|
|
|
| When $|b_i - b_{i-1}| \le 1$ (adjacent or same bin), the inner sum is
|
| empty and no gap-filling occurs. When $|b_i - b_{i-1}| > 1$, the number
|
| of gap-filled (intermediate) bins is
|
| \begin{equation}\label{eq:Ki}
|
| G_i \;=\; \bigl|b(p_i) - b(p_{i-1})\bigr| - 1.
|
| \end{equation}
|
| The total span of bins traversed, inclusive of both endpoints, is
|
| $K_i = |b_i - b_{i-1}| + 1$, which equals $G_i + 2$ whenever the two ticks
|
| fall in different bins.
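|
|
|
| To make these counts concrete, take the pair C$\to$D of
|
| \Cref{tab:datapoints} with $\beta=1$ (a sketch that uses only values
|
| already listed there):
|
| \begin{align*}
|
| b(3002.432)&=3002, \qquad b(3009.892)=3009,\\
|
| G_i&=|3009-3002|-1=6,\\
|
| K_i&=|3009-3002|+1=8.
|
| \end{align*}
|
| The six gap-filled bins are $3003,\dots,3008$; together with the two
|
| endpoint bins they make up the eight bins traversed.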
|
|
|
| \begin{algorithm}[!t]
|
| \caption{GMP Construction (Two-Phase)}\label{alg:gmp}
|
| \begin{algorithmic}[1]
|
| \REQUIRE Tick stream $\{p_i\}_{i=1}^{N}$, bin size $\beta$
|
| \ENSURE Profile array $S_{\text{GMP}}[\cdot]$
|
| \STATE Initialise $S_{\text{GMP}}[k]\leftarrow 0\;\;\forall\,k$
|
| \FOR{$i = 1$ \TO $N$} \COMMENT{Phase~1: CMP placement}
|
| \STATE $S_{\text{GMP}}[\lfloor p_i/\beta \rfloor] \leftarrow
|
| S_{\text{GMP}}[\lfloor p_i/\beta \rfloor] + 1$
|
| \ENDFOR
|
| \FOR{$i = 2$ \TO $N$} \COMMENT{Phase~2: gap-fill}
|
| \STATE $k_{\text{from}} \leftarrow \lfloor p_{i-1}/\beta \rfloor$;
|
| $k_{\text{to}} \leftarrow \lfloor p_i/\beta \rfloor$
|
| \IF{$|k_{\text{to}} - k_{\text{from}}| > 1$}
|
| \STATE $d \leftarrow \text{sign}(k_{\text{to}} - k_{\text{from}})$
|
| \FOR{$k = k_{\text{from}} + d$ \TO $k_{\text{to}} - d$ \textbf{step} $d$}
|
| \STATE $S_{\text{GMP}}[k] \leftarrow S_{\text{GMP}}[k] + 1$
|
| \ENDFOR
|
| \ENDIF
|
| \ENDFOR
|
| \RETURN $S_{\text{GMP}}$
|
| \end{algorithmic}
|
| \end{algorithm}
|
|
|
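| \textbf{Complexity.}\;Let
|
| $D=\sum_{i=2}^{N}|b(p_i)-b(p_{i-1})|$ denote the cumulative bin
|
| displacement (a scalar, distinct from the per-bin down-count $D[k]$ of
|
| \Cref{tab:notation}). GMP performs $\mathcal{O}(N + D)$ operations.
|
| Whenever every pair of consecutive ticks lies in the same or adjacent
|
| bins, Phase~2 adds nothing and GMP coincides with CMP; the case $D=0$,
|
| where all ticks share one bin, is the extreme instance. In the worst case,
|
| $D=\mathcal{O}(N\cdot\Delta p_{\max}/\beta)$, where
|
| $\Delta p_{\max}=\max_i |p_i-p_{i-1}|$ is the largest single-move
|
| displacement.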
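|
|
|
| For a rough sense of scale, the ten ticks of \Cref{tab:datapoints} with
|
| $\beta=1$ give the consecutive bin displacements
|
| \[
|
| (3,\,1,\,7,\,2,\,2,\,6,\,1,\,1,\,0), \qquad
|
| \sum_{i=2}^{10}|b(p_i)-b(p_{i-1})| = 23,
|
| \]
|
| so the bound above corresponds to on the order of $10+23=33$ elementary
|
| steps for this stream. This is an illustrative count under the stated
|
| bound, not a measured running time.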
|
|
|
| \subsection{mBA-GMP: Applying GMP to Raw Tick Data}\label{sec:mba}
|
|
|
| The key contribution of mBA-GMP is \emph{not} a novel interpolation rule
|
| per~se, but rather the principled insistence that GMP must be applied to
|
| raw bid/ask tick-formation data:
|
|
|
| \begin{enumerate}
|
| \item \textbf{Data source.}\;Use the broker's microsecond- or
|
| millisecond-resolution bid/ask feed, the finest-granularity record
|
| available, rather than any TOCHLV candlestick derivative.
|
| \item \textbf{Trade indexing.}\;Index the $x$-axis by trade sequence
|
| number, not by wall-clock time (cf.~\cite{clark1973subordinated,
|
| ane2000order}).
|
| \item \textbf{Gap filling.}\;Apply \Cref{alg:gmp} to the tick stream.
|
| \end{enumerate}
|
|
|
| \begin{algorithm}[!t]
|
| \caption{mBA-GMP Pipeline}\label{alg:mba}
|
| \begin{algorithmic}[1]
|
| \REQUIRE Raw bid/ask tick feed $\mathcal{T}$, bin size $\beta$
|
| \ENSURE Gap-filled profile $S_{\text{GMP}}[\cdot]$
|
| \STATE Extract price sequence $\{p_i\}_{i=1}^{N}$ from $\mathcal{T}$,
|
| indexed by trade count
|
| \STATE $S_{\text{GMP}} \leftarrow \textsc{GMP}(\{p_i\},\,\beta)$
|
| \COMMENT{Algorithm~\ref{alg:gmp}}
|
| \RETURN $S_{\text{GMP}}$
|
| \end{algorithmic}
|
| \end{algorithm}
|
|
|
| By operating on raw ticks, mBA-GMP avoids the aggregation artefacts
|
| inherent in candlestick data~\cite{harris1990estimation} (e.g., arbitrary
|
| bar boundaries, concealed intra-bar reversals) and ensures that every
|
| micro-level price traversal is captured in the profile.
|
|
|
| \subsection{Up/Down-Bin Footprint Profile}\label{sec:updown}
|
|
|
| Building upon the gap-filled structure of GMP, we introduce a directional
|
| classification layer termed the \emph{Up/Down-Bin Footprint Profile}.
|
| Unlike order-flow bid/ask footprint charts, which rely on volume traded at
|
| the bid versus the ask, our footprint is derived purely from the GMP
|
| traversal mechanics.
|
|
|
| For every consecutive pair $(p_{i-1},\,p_i)$, the move is classified as
|
| upward if $p_i > p_{i-1}$ and downward otherwise, so a repeated price
|
| ($p_i = p_{i-1}$) counts as a down move. When the two ticks fall in
|
| different bins, the origin bin $b(p_{i-1})$ receives no directional credit
|
| for this move (it was already credited by the move that arrived there),
|
| while every subsequent bin along the traversed path, up to and including
|
| the destination bin $b(p_i)$, increments its \emph{up-bin} count $U[k]$ or
|
| its \emph{down-bin} count $D[k]$ accordingly. When both ticks fall in the
|
| same bin, that bin itself receives the single up or down credit.
|
|
|
| \begin{algorithm}[!t]
|
| \caption{Up/Down-Bin Footprint Construction}\label{alg:updown}
|
| \begin{algorithmic}[1]
|
| \REQUIRE Tick stream $\{p_i\}_{i=1}^{N}$, bin size $\beta$
|
| \ENSURE Profile arrays $U[\cdot], D[\cdot], \delta[\cdot]$
|
| \STATE Initialise $U[k]\leftarrow 0, D[k]\leftarrow 0\;\;\forall\,k$
|
| \FOR{$i = 2$ \TO $N$}
|
| \STATE $k_{\text{from}} \leftarrow \lfloor p_{i-1}/\beta \rfloor$;
|
| $k_{\text{to}} \leftarrow \lfloor p_i/\beta \rfloor$
|
| \IF{$k_{\text{from}} = k_{\text{to}}$}
|
| \IF{$p_i > p_{i-1}$}
|
| \STATE $U[k_{\text{from}}] \leftarrow U[k_{\text{from}}] + 1$
|
| \ELSE
|
| \STATE $D[k_{\text{from}}] \leftarrow D[k_{\text{from}}] + 1$
|
| \ENDIF
|
| \STATE \textbf{continue}
|
| \ENDIF
|
| \STATE $\text{is\_up} \leftarrow (k_{\text{to}} > k_{\text{from}})$
|
| \STATE $d \leftarrow \text{sign}(k_{\text{to}} - k_{\text{from}})$
|
| \STATE $k \leftarrow k_{\text{from}} + d$
|
| \WHILE{\textbf{true}}
|
| \IF{$\text{is\_up}$}
|
| \STATE $U[k] \leftarrow U[k] + 1$
|
| \ELSE
|
| \STATE $D[k] \leftarrow D[k] + 1$
|
| \ENDIF
|
| \IF{$k = k_{\text{to}}$}
|
| \STATE \textbf{break}
|
| \ENDIF
|
| \STATE $k \leftarrow k + d$
|
| \ENDWHILE
|
| \ENDFOR
|
| \FORALL{$k$}
|
| \STATE $\delta[k] \leftarrow U[k] - D[k]$
|
| \ENDFOR
|
| \RETURN $U,\,D,\,\delta$
|
| \end{algorithmic}
|
| \end{algorithm}
|
|
|
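| This algorithm visits one bin per unit of bin displacement, plus one bin
|
| for each same-bin move, so it runs in $\mathcal{O}(N+D)$ time with $D$ the
|
| cumulative bin displacement of \Cref{sec:gmp}. It therefore matches the
|
| cost of GMP while exposing which bins were crossed predominantly upward
|
| or downward.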
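|
|
|
| As an illustrative trace, applying \Cref{alg:updown} to the first three
|
| ticks A, B, C of \Cref{tab:datapoints} with $\beta=1$ gives
|
| \begin{align*}
|
| \text{A}\to\text{B}\ (3000\to 3003,\ \text{up}):&\quad
|
| U[3001],\,U[3002],\,U[3003] \leftarrow 1,\\
|
| \text{B}\to\text{C}\ (3003\to 3002,\ \text{down}):&\quad
|
| D[3002] \leftarrow 1.
|
| \end{align*}
|
| After these two moves $\delta[3001]=+1$, $\delta[3002]=1-1=0$, and
|
| $\delta[3003]=+1$, while bin~$3000$, which holds only the first tick,
|
| carries no credit.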
|
|
|
|
|
|
|
|
|
|
|
| \section{Dataframe Recording Approach}\label{sec:dataframe}
|
|
|
| To build practical intuition for how CMP and GMP profiles are constructed,
|
| this section walks through a concrete 10-datapoint example using a
|
| \emph{dataframe}-style representation. Each raw data record is a triple
|
| $(\text{label},\;x,\;y)$ where \textit{label} is an alphabetic identifier,
|
| $x$ is the trade index (or time), and $y$ is the observed price.
|
| \Cref{tab:datapoints} lists the input data.
|
|
|
| \begin{table}[!t]
|
| \centering
|
| \caption{Input Datapoints (XAUUSD Example, 10 Ticks)}
|
| \label{tab:datapoints}
|
| \begin{tabular}{@{}ccc@{}}
|
| \toprule
|
| \textbf{Datapoint} & \textbf{Trade \#} & \textbf{Price (USD)} \\
|
| \midrule
|
| A & 1 & 3000.914 \\
|
| B & 2 & 3003.837 \\
|
| C & 3 & 3002.432 \\
|
| D & 4 & 3009.892 \\
|
| E & 5 & 3007.698 \\
|
| F & 6 & 3009.176 \\
|
| G & 7 & 3003.381 \\
|
| H & 8 & 3004.283 \\
|
| I & 9 & 3003.512 \\
|
| J & 10 & 3003.012 \\
|
| \bottomrule
|
| \end{tabular}
|
| \end{table}
|
|
|
| \Cref{fig:price_scatter} plots these datapoints as a price-vs-trade-index
|
| scatter chart, illustrating the raw price path that both CMP and GMP
|
| will profile.
|
|
|
| \begin{figure}[!t]
|
| \centering
|
| \includegraphics[width=\columnwidth]{fig_price_scatter.png}
|
| \caption{Price vs.\ trade index for the 10-datapoint XAUUSD example
|
| (A--J). Each point represents one raw tick-formation record.}
|
| \label{fig:price_scatter}
|
| \end{figure}
|
|
|
| \subsection{CMP Output Dataframe}\label{sec:df_cmp}
|
|
|
| Using $\beta=1$, the bin index for each tick is $b(p)=\lfloor p\rfloor$.
|
| CMP simply counts how many datapoints fall into each bin.
|
| \Cref{tab:cmp_df} shows the resulting dataframe: bins are numbered~1
|
| through~10 from the lowest observed price bin to the highest, and each row
|
| covers the half-open interval $[\text{From},\,\text{Until})$. The
|
| \emph{Group} column records which datapoint labels landed in each bin,
|
| and \emph{Stacks} is the group size.
|
|
|
| \begin{table}[!t]
|
| \centering
|
| \caption{CMP Output Dataframe ($\beta=1$)}
|
| \label{tab:cmp_df}
|
| \begin{tabular}{@{}ccccc@{}}
|
| \toprule
|
| \textbf{Bin} & \textbf{From} & \textbf{Until}
|
| & \textbf{Group} & \textbf{Stacks} \\
|
| \midrule
|
| 1 & 3000 & 3001 & A & 1 \\
|
| 2 & 3001 & 3002 & & 0 \\
|
| 3 & 3002 & 3003 & C & 1 \\
|
| 4 & 3003 & 3004 & BGIJ & 4 \\
|
| 5 & 3004 & 3005 & H & 1 \\
|
| 6 & 3005 & 3006 & & 0 \\
|
| 7 & 3006 & 3007 & & 0 \\
|
| 8 & 3007 & 3008 & E & 1 \\
|
| 9 & 3008 & 3009 & & 0 \\
|
| 10 & 3009 & 3010 & DF & 2 \\
|
| \midrule
|
| \multicolumn{4}{c}{\textbf{Total stacks}} & \textbf{10} \\
|
| \bottomrule
|
| \end{tabular}
|
| \end{table}
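|
|
|
| The bin numbers in \Cref{tab:cmp_df} are a relabelling of the bin index of
|
| \eqref{eq:bin}. One way to write the mapping (the symbols $n$ and
|
| $b_{\min}$ are introduced here for illustration only) is
|
| \[
|
| n(p) = b(p) - b_{\min} + 1, \qquad
|
| b_{\min} = \lfloor 3000.914 \rfloor = 3000,
|
| \]
|
| so that datapoint~B gives $n = 3003 - 3000 + 1 = 4$ and datapoint~D gives
|
| $n = 3009 - 3000 + 1 = 10$.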
|
|
|
| Note that bins~2, 6, 7, and~9 have zero stacks---these are the
|
| \emph{gaps} in the CMP profile. The CMP histogram formed by the
|
| \emph{datapoint group} column is exactly the $y$-distribution
|
| histogram used by traditional Market Profile implementations.
|
|
|
| \begin{figure}[!t]
|
| \centering
|
| \includegraphics[width=0.85\columnwidth]{fig_cmp_profile.png}
|
| \caption{CMP profile for the 10-datapoint example ($\beta=1$).
|
| Four bins (2,\,6,\,7,\,9) are empty, revealing gaps in price coverage.}
|
| \label{fig:cmp_chart}
|
| \end{figure}
|
|
|
| \subsection{GMP Output Dataframe}\label{sec:df_gmp}
|
|
|
| GMP augments the CMP result by filling every intermediate bin that price
|
| must have traversed between consecutive datapoints. The gap-filling
|
| convention is:
|
|
|
| \begin{enumerate}
|
| \item Each datapoint contributes one stack to its own bin (identical to
|
| CMP).
|
| \item For each consecutive pair $(i,\,i{+}1)$, every bin strictly
|
| \emph{between} $b(p_i)$ and $b(p_{i+1})$ (exclusive of both
|
| endpoints) receives one additional stack, labelled with the
|
| source datapoint~$i$.
|
| \end{enumerate}
|
|
|
| \Cref{tab:gmp_df} shows the resulting GMP dataframe. All bins now have
|
| at least one stack---no gaps remain.
|
|
|
| \begin{table}[!t]
|
| \centering
|
| \caption{GMP Output Dataframe ($\beta=1$)}
|
| \label{tab:gmp_df}
|
| \begin{tabular}{@{}ccccc@{}}
|
| \toprule
|
| \textbf{Bin} & \textbf{From} & \textbf{Until}
|
| & \textbf{Group} & \textbf{Stacks} \\
|
| \midrule
|
| 1 & 3000 & 3001 & A & 1 \\
|
| 2 & 3001 & 3002 & A & 1 \\
|
| 3 & 3002 & 3003 & AC & 2 \\
|
| 4 & 3003 & 3004 & BCGIJ & 5 \\
|
| 5 & 3004 & 3005 & CFH & 3 \\
|
| 6 & 3005 & 3006 & CF & 2 \\
|
| 7 & 3006 & 3007 & CF & 2 \\
|
| 8 & 3007 & 3008 & CEF & 3 \\
|
| 9 & 3008 & 3009 & CDEF & 4 \\
|
| 10 & 3009 & 3010 & DF & 2 \\
|
| \midrule
|
| \multicolumn{4}{c}{\textbf{Total stacks}} & \textbf{25} \\
|
| \bottomrule
|
| \end{tabular}
|
| \end{table}
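|
|
|
| The totals in \Cref{tab:gmp_df} can be cross-checked pair by pair. Under
|
| the convention above, A$\to$B fills bins 2--3 (2~stacks), C$\to$D fills
|
| bins 4--9 (6~stacks), D$\to$E and E$\to$F each fill bin~9 (1~stack each),
|
| and F$\to$G fills bins 5--9 (5~stacks). The remaining pairs land in the
|
| same or adjacent bins and add nothing. Hence
|
| \[
|
| \sum_{k} S_{\text{GMP}}[k] \;=\; 10 + (2+6+1+1+5) \;=\; 25 .
|
| \]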
|
|
|
| \begin{figure}[!t]
|
| \centering
|
| \includegraphics[width=0.85\columnwidth]{fig_gmp_profile.png}
|
| \caption{GMP profile for the 10-datapoint example ($\beta=1$).
|
| Every bin is populated; the profile fully represents the price
|
| range traversed by the market.}
|
| \label{fig:gmp_chart}
|
| \end{figure}
|
|
|
| \subsection{CMP vs.\ GMP Side-by-Side}\label{sec:df_compare}
|
|
|
| \Cref{fig:cmp_vs_gmp_10pt} places both profiles side by side. CMP
|
| concentrates its 10~stacks in 6 of the 10~bins, leaving 40\,\% of bins
|
| empty, while GMP distributes 25~stacks across all 10~bins. Every bin holds
|
| at least as many stacks under GMP as under CMP, and the resulting
|
| histogram describes price traversal rather than traded volume.
|
|
|
| \begin{figure}[!t]
|
| \centering
|
| \includegraphics[width=\columnwidth]{fig_cmp_vs_gmp.png}
|
| \caption{Side-by-side comparison of CMP (left, orange) and GMP (right,
|
| green) for the 10-datapoint XAUUSD example with $\beta=1$.}
|
| \label{fig:cmp_vs_gmp_10pt}
|
| \end{figure}
|
|
|
| \Cref{fig:combined_3panel} presents the entire pipeline---from raw
|
| datapoints, through CMP, to GMP---in a single three-panel view, with
|
| every bar annotated by its constituent datapoint letters.
|
|
|
| \begin{figure*}[!t]
|
| \centering
|
| \includegraphics[width=\textwidth]{fig_combined_3panel.png}
|
| \caption{Three-panel overview: raw datapoints (left), CMP profile
|
| with group letters (centre), and GMP profile with group letters
|
| (right). Every bar is annotated with the alphabetic labels of the
|
| datapoints it contains, making the gap-filling effect directly
|
| visible. The gap-filling approach is most effective when applied to micro bid/ask
|
| (mBA) raw tick-formation data.}
|
| \label{fig:combined_3panel}
|
| \end{figure*}
|
|
|
| \begin{quote}
|
| \textbf{Rendering note.}\;In the dataframe tables above, bin~1
|
| (lowest price) appears at the \emph{top} of the table. On an actual
|
| price chart, however, the lowest price is at the \emph{bottom} of the
|
| $y$-axis and the highest price at the top---the profile histogram is
|
| effectively ``flipped'' relative to the tabular representation.
|
| \end{quote}
|
|
|
| \subsection{Up/Down-Bin Footprint Dataframe}\label{sec:df_updown}
|
|
|
| Applying \Cref{alg:updown} to the same 10-datapoint trajectory yields
|
| the directional footprint dataframe shown in \Cref{tab:updown_df}.
|
| For example, the move from A (3000.914) to B (3003.837) causes bins 2, 3, and 4
|
| to receive $+1$ up-bin point. The movement from C (3002.432) to D (3009.892)
|
| applies up-bin points to bins 4 through 10. Downward movements, such as
|
| B down to C, or D down to E, function symmetrically. Note that
|
| the very first datapoint (A) carries no directional value because no
|
| movement precedes it, which is why bin~1 shows zero in both columns. A
|
| move within a single bin is credited to that bin: the step from I
|
| (3003.512) down to J (3003.012) stays in bin~4 and adds one down-bin
|
| point there.
|
|
|
| \begin{table}[!t]
|
| \centering
|
| \caption{Up/Down-Bin Footprint Output Dataframe ($\beta=1$)}
|
| \label{tab:updown_df}
|
| \begin{tabular}{@{}cccccrr@{}}
|
| \toprule
|
| \textbf{Bin} & \textbf{From} & \textbf{Until}
|
| & \textbf{Group} & \textbf{Down} & \textbf{Up} & \textbf{Delta} \\
|
| \midrule
|
| 1 & 3000 & 3001 & A & 0 & 0 & 0 \\
|
| 2 & 3001 & 3002 & A & 0 & 1 & +1 \\
|
| 3 & 3002 & 3003 & AC & 1 & 1 & 0 \\
|
| 4 & 3003 & 3004 & BCGIJ & 3 & 2 & -1 \\
|
| 5 & 3004 & 3005 & CFH & 1 & 2 & +1 \\
|
| 6 & 3005 & 3006 & CF & 1 & 1 & 0 \\
|
| 7 & 3006 & 3007 & CF & 1 & 1 & 0 \\
|
| 8 & 3007 & 3008 & CEF & 2 & 1 & -1 \\
|
| 9 & 3008 & 3009 & CDEF & 2 & 2 & 0 \\
|
| 10 & 3009 & 3010 & DF & 0 & 2 & +2 \\
|
| \bottomrule
|
| \end{tabular}
|
| \end{table}
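|
|
|
| One consistency check follows from comparing \Cref{alg:gmp} with
|
| \Cref{alg:updown} (it is derived here as a sketch, not taken from the
|
| generated CSV data): every tick after the first earns one directional
|
| credit in exactly the bins where GMP adds a stack, so
|
| \[
|
| U[k] + D[k] \;=\; S_{\text{GMP}}[k] - \mathbf{1}\!\bigl[k=b(p_1)\bigr].
|
| \]
|
| In \Cref{tab:updown_df} the Up column sums to $13$ and the Down column to
|
| $11$, giving $24 = 25 - 1$ against the GMP total of \Cref{tab:gmp_df}; the
|
| net delta is $\sum_k \delta[k] = 13 - 11 = +2$.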
|
|
|
| \Cref{fig:updown_footprint} visualises this footprint as a dual-axis
|
| histogram. Each bin possesses opposing horizontal stacks indicating the
|
| total amount of upward versus downward crossing, exposing the directional bias
|
| driving the profile gap-fills.
|
|
|
| \begin{figure}[!t]
|
| \centering
|
| \includegraphics[width=\columnwidth]{fig_updown_footprint.png}
|
| \caption{Up/Down-Bin Footprint Profile for the 10-datapoint example.
|
| Red bars (left) show the down-bin counts $D[k]$; teal bars (right) show
|
| the up-bin counts $U[k]$. Delta values $\delta[k]$ denote net directional
|
| pressure at each price bin.}
|
| \label{fig:updown_footprint}
|
| \end{figure}
|
|
|
|
|
|
|
|
|
|
|
| \section{Effect of Bin Size on Profile Resolution}\label{sec:binsize}
|
|
|
|
|
| The bin-size parameter $\beta$ controls the granularity of the profile.
|
| For two consecutive ticks at prices $p_{i-1}$ and $p_i$, the total number
|
| of bins traversed (inclusive of both endpoints) is
|
| \begin{equation}\label{eq:bins_beta}
|
| K_i(\beta) \;=\;
|
| \left|\left\lfloor \frac{p_i}{\beta} \right\rfloor
|
| - \left\lfloor \frac{p_{i-1}}{\beta} \right\rfloor\right|
|
| + 1.
|
| \end{equation}
|
|
|
| \noindent Halving $\beta$ approximately doubles the number of interpolated
|
| bins, while doubling $\beta$ approximately halves it.
|
|
|
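| \begin{proposition}[Bin-count scaling]\label{prop:scaling}
|
| Let $\beta_1 > \beta_2 > 0$ and suppose that both $p_{i-1}$ and $p_i$ are
|
| integer multiples of $\beta_1$ (grid-aligned ticks). Then the bin counts
|
| satisfy
|
| \begin{equation}\label{eq:scaling}
|
| K_i(\beta_2) \;\ge\;
|
| \left\lfloor \frac{\beta_1}{\beta_2} \right\rfloor
|
| \cdot \bigl(K_i(\beta_1) - 1\bigr) + 1.
|
| \end{equation}
|
| \end{proposition}
|
|
|
| \begin{proof}
|
| Let $\Delta p = |p_i - p_{i-1}|$. Grid alignment gives
|
| $\Delta p = (K_i(\beta_1)-1)\,\beta_1$ exactly. Since
|
| $\lfloor a\rfloor - \lfloor b\rfloor \ge \lfloor a-b\rfloor$ for all real
|
| $a \ge b$, \eqref{eq:bins_beta} yields
|
| $K_i(\beta_2) \ge \lfloor \Delta p / \beta_2 \rfloor + 1
|
| = \lfloor (K_i(\beta_1)-1)\,\beta_1 / \beta_2 \rfloor + 1
|
| \ge \lfloor \beta_1/\beta_2 \rfloor\,(K_i(\beta_1)-1) + 1$,
|
| where the last step uses $\lfloor nx\rfloor \ge n\lfloor x\rfloor$ for
|
| every integer $n\ge 0$ and real $x\ge 0$. The alignment hypothesis is
|
| needed: for $p_{i-1}=0.9$, $p_i=1.1$, $\beta_1=1$, $\beta_2=0.5$ one finds
|
| $K_i(\beta_1)=2$ and $K_i(\beta_2)=2$, whereas the right-hand side of
|
| \eqref{eq:scaling} equals~$3$.
|
| \end{proof}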
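|
|
|
| As a worked instance of \eqref{eq:scaling}, take the prices \$3{,}000 and
|
| \$3{,}010 (both integer multiples of $\beta_1$) with $\beta_1=1$ and
|
| $\beta_2=0.25$; these two bin sizes are chosen here purely for
|
| illustration:
|
| \begin{align*}
|
| K_i(1) &= |3010-3000|+1 = 11,\\
|
| \left\lfloor \tfrac{1}{0.25} \right\rfloor \bigl(K_i(1)-1\bigr)+1
|
| &= 4\cdot 10+1 = 41,\\
|
| K_i(0.25) &= |12040-12000|+1 = 41.
|
| \end{align*}
|
| The bound is attained with equality here because $\beta_1/\beta_2$ is an
|
| integer.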
|
|
|
| \Cref{tab:binsize} illustrates how varying $\beta$ changes the GMP
|
| resolution for the XAUUSD example where price moves from \$3{,}000 to
|
| \$3{,}010 ($\Delta p = 10$).
|
|
|
| \begin{table}[!t]
|
| \centering
|
| \caption{Effect of Bin Size ($\beta$) on GMP Bin Count for $\Delta p = 10$}
|
| \label{tab:binsize}
|
| \begin{tabular}{@{}cccc@{}}
|
| \toprule
|
| $\beta$ (USD) & $K_i(\beta)$ & CMP bins & GMP bins filled \\
|
| \midrule
|
| 2.0 & 6 & 2 & 6 \\
|
| 1.0 & 11 & 2 & 11 \\
|
| 0.5 & 21 & 2 & 21 \\
|
| 0.25 & 41 & 2 & 41 \\
|
| 0.1 & 101 & 2 & 101 \\
|
| \bottomrule
|
| \end{tabular}
|
| \end{table}
|
|
|
| \noindent
|
| Two key observations follow:
|
| \begin{itemize}
|
| \item \textbf{CMP is invariant to~$\beta$ in bin count:} for every
|
| $\beta$ listed in \Cref{tab:binsize}, CMP fills exactly~2 bins (one per
|
| observed tick), because no intermediate bins are populated. The count
|
| falls to~1 only when $\beta$ is large enough for both ticks to share a bin.
|
| \item \textbf{GMP scales as~$\mathcal{O}(\Delta p\,/\,\beta)$:}
|
| the filled bin count grows inversely with~$\beta$, producing a
|
| progressively finer-grained profile. Setting $\beta$~below the
|
| instrument's tick size creates bins that no quoted price can occupy, so
|
| the extra interpolated stacks add no information; the practical
|
| lower bound is $\beta \ge \text{tick\_size}$.
|
| \end{itemize}
|
|
|
| \Cref{tab:binsize_half} presents the full GMP profile comparison for
|
| $\beta=1$ versus $\beta=0.5$.
|
|
|
| \begin{table}[!t]
|
| \centering
|
| \caption{GMP Profile: $\beta=1$ vs.\ $\beta=0.5$ (Price from \$3{,}000 to \$3{,}010)}
|
| \label{tab:binsize_half}
|
| \begin{tabular}{@{}cccc@{}}
|
| \toprule
|
| Price (USD) & CMP & GMP ($\beta\!=\!1$) & GMP ($\beta\!=\!0.5$) \\
|
| \midrule
|
| 3000.0 & 1 & 1 & 1 \\
|
| 3000.5 & 0 & & 1 \\
|
| 3001.0 & 0 & 1 & 1 \\
|
| 3001.5 & 0 & & 1 \\
|
| 3002.0 & 0 & 1 & 1 \\
|
| 3002.5 & 0 & & 1 \\
|
| 3003.0 & 0 & 1 & 1 \\
|
| 3003.5 & 0 & & 1 \\
|
| 3004.0 & 0 & 1 & 1 \\
|
| 3004.5 & 0 & & 1 \\
|
| 3005.0 & 0 & 1 & 1 \\
|
| 3005.5 & 0 & & 1 \\
|
| 3006.0 & 0 & 1 & 1 \\
|
| 3006.5 & 0 & & 1 \\
|
| 3007.0 & 0 & 1 & 1 \\
|
| 3007.5 & 0 & & 1 \\
|
| 3008.0 & 0 & 1 & 1 \\
|
| 3008.5 & 0 & & 1 \\
|
| 3009.0 & 0 & 1 & 1 \\
|
| 3009.5 & 0 & & 1 \\
|
| 3010.0 & 1 & 1 & 1 \\
|
| \midrule
|
| \textbf{Total bins} & \textbf{2} & \textbf{11} & \textbf{21} \\
|
| \bottomrule
|
| \end{tabular}
|
| \end{table}
|
|
|
|
|
|
|
|
|
|
|
| \section{Illustrative Example}\label{sec:example}
|
|
|
| Consider two raw XAUUSD ticks with $\beta=1$:
|
| \begin{itemize}
|
| \item Tick~1: trade at $p_1 = \$3{,}000$.
|
| \item Tick~2: trade at $p_2 = \$3{,}010$.
|
| \end{itemize}
|
|
|
| \Cref{tab:cmp_vs_gmp} shows the resulting profiles side-by-side.
|
|
|
| \begin{table}[!t]
|
| \centering
|
| \caption{CMP vs.\ GMP Comparison ($\beta=1$, XAUUSD). A dash in the
|
| Trade~\# column marks a price level with no observed trade.}
|
| \label{tab:cmp_vs_gmp}
|
| \begin{tabular}{@{}cccc@{}}
|
| \toprule
|
| Trade \# & Price (USD) & CMP stacks & GMP stacks \\
|
| \midrule
|
| 1 & 3000 & 1 & 1 \\
|
| -- & 3001 & 0 & 1 \\
|
| -- & 3002 & 0 & 1 \\
|
| -- & 3003 & 0 & 1 \\
|
| -- & 3004 & 0 & 1 \\
|
| -- & 3005 & 0 & 1 \\
|
| -- & 3006 & 0 & 1 \\
|
| -- & 3007 & 0 & 1 \\
|
| -- & 3008 & 0 & 1 \\
|
| -- & 3009 & 0 & 1 \\
|
| 2 & 3010 & 1 & 1 \\
|
| \midrule
|
| \multicolumn{2}{c}{\textbf{Total stacks}} & \textbf{2} & \textbf{11} \\
|
| \bottomrule
|
| \end{tabular}
|
| \end{table}
|
|
|
| CMP records only 2~stacks at the observed prices; GMP records
|
| 11~stacks spanning the full traversal. \Cref{fig:profile} visualises
|
| both profiles as horizontal histograms.
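|
| The same example can be written in set notation. The symbols
| $\mathcal{B}_{\mathrm{CMP}}$, $\mathcal{B}_{\mathrm{GMP}}$, and $s$ are
| assumptions made for this sketch rather than notation defined elsewhere in
| the paper, and both prices are assumed to lie on the $\beta$ grid.
| \begin{align*}
| \mathcal{B}_{\mathrm{CMP}} &= \{p_1,\; p_2\} = \{3000,\; 3010\}, \\
| \mathcal{B}_{\mathrm{GMP}} &= \Bigl\{\, p_1 + s\,k\,\beta \;:\; k = 0, 1, \dots, \tfrac{|p_2 - p_1|}{\beta} \Bigr\},
| \qquad s = \operatorname{sgn}(p_2 - p_1).
| \end{align*}
| With $\beta = 1$ and $s = +1$, the index $k$ runs from 0 to 10, which
| reproduces the eleven prices listed in \Cref{tab:cmp_vs_gmp}; a downward move
| would populate the same bins with $s = -1$.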
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| \section{Discussion}\label{sec:discussion}
|
|
|
| \subsection{Advantages}
|
|
|
| \begin{enumerate}
|
| \item \textbf{Maximal resolution.}\;By operating on raw bid/ask ticks,
|
| mBA-GMP captures every price movement the broker records---no
|
| information is pre-aggregated or discarded.
|
| \item \textbf{No profile gaps.}\;Gap-filling ensures that every price
|
| level traversed by the market is represented, preventing the sparse,
|
| misleading histograms produced by CMP on fast-moving ticks.
|
| \item \textbf{Volume-neutral interpolation.}\;Interpolated bins receive
|
| exactly one stack each, reflecting a traversal rather than fabricating
|
| volume. This preserves the interpretive semantics of the profile:
|
| high-stack regions still correspond to genuine price acceptance.
|
| \item \textbf{Directional context.}\;By classifying gap-filled stacks into
|
| up/down bins, the resultant footprint profile reveals net directional
|
| pressure across the evaluated interval, independent of conventional
|
| bid/ask volume mechanics.
|
| \item \textbf{Tunable resolution via~$\beta$.}\;The bin-size parameter
|
| allows practitioners to control profile granularity without altering
|
| the underlying data, unlike candlestick-based approaches where
|
| resolution is fixed by the bar period.
|
| \end{enumerate}
|
|
|
| \subsection{Limitations}
|
|
|
| \begin{enumerate}
|
| \item \textbf{Data availability.}\;Not all brokers expose raw
|
| micro-/millisecond tick feeds. Where only TOCHLV data is available,
|
| GMP can still be applied to candlestick prices, but the
|
| ``mBA'' guarantee is lost.
|
| \item \textbf{Computational cost.}\;The $\mathcal{O}(N + D)$ complexity
|
| implies that highly volatile instruments with large cumulative
|
| displacement~$D$ will require proportionally more computation. For
|
| modern hardware this is rarely a practical bottleneck, but
|
| memory-constrained environments may require streaming or windowed
|
| implementations.
|
| \item \textbf{Interpolation assumption.}\;Gap-filling assumes that price
|
| continuously traverses every intermediate level. In instruments with
|
| genuine price gaps (e.g., exchange-traded equities at market open),
|
| this assumption may over-represent bins that were never actually
|
| available for trading.
|
| \end{enumerate}
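|
| To make the cost term concrete, the following worked estimate applies the
| $\mathcal{O}(N + D)$ bound to the two-tick move from \$3{,}000 to \$3{,}010.
| It assumes, as an interpretation for this sketch only, that the
| displacement~$D$ is counted in bins, i.e.\
| $D = \sum_{i=2}^{N} |p_i - p_{i-1}| / \beta$; the formal definition given with
| the complexity analysis takes precedence if it differs.
| \begin{align*}
| N &= 2, & D(\beta) &= \frac{|3010 - 3000|}{\beta}, \\
| D(1) &= 10, & D(0.5) &= 20, \\
| N + D(1) &= 12, & N + D(0.5) &= 22.
| \end{align*}
| Under this reading, halving $\beta$ leaves $N$ unchanged and doubles $D$, so
| the gap-filling work for a fixed tick stream grows roughly in proportion to
| $1/\beta$.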
|
|
|
| \subsection{Practical Guidance on Choosing $\beta$}
|
|
|
| \begin{itemize}
|
| \item Set $\beta$ close to the instrument's minimum tick size for maximum
|
| resolution (e.g., $\beta = 0.01$ for XAUUSD, which many brokers quote with a minimum tick of 0.001).
|
| \item Increase $\beta$ to reduce noise in low-liquidity regimes or to
|
| align bins with round-number psychological levels.
|
| \item As shown in \Cref{sec:binsize}, CMP bin count is invariant
|
| to~$\beta$; thus, the resolution advantage of GMP grows as $\beta$
|
| decreases.
|
| \end{itemize}
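|
| The last point can be illustrated on the two-tick move from \$3{,}000 to
| \$3{,}010. The ratio $R(\beta)$ below is notation assumed here for
| illustration, not a quantity defined elsewhere in the paper, and it presumes
| that both prices lie on the $\beta$ grid.
| \begin{align*}
| R(\beta) &= \frac{\text{GMP bins}}{\text{CMP bins}} = \frac{10/\beta + 1}{2}, \\
| R(1) &= \frac{11}{2} = 5.5, \\
| R(0.5) &= \frac{21}{2} = 10.5, \\
| R(0.01) &= \frac{1001}{2} = 500.5.
| \end{align*}
| The CMP denominator stays at 2 throughout, so the ratio grows essentially as
| $1/\beta$.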
|
|
|
|
|
|
|
|
|
|
|
| \section{Conclusion}\label{sec:conclusion}
|
|
|
| We have presented \textbf{mBA-GMP}, a market-profile construction method
|
| that combines two principles: (i)~sourcing data from raw micro-/millisecond
|
| bid/ask tick-formation records rather than pre-aggregated candlesticks,
|
| and (ii)~interpolating all intermediate price bins between consecutive
|
| ticks. We formalised both Conventional Market Profile (CMP) and
|
| Gap-filled Market Profile (GMP), provided pseudocode algorithms with
|
| complexity analysis, and demonstrated that the bin-size parameter~$\beta$
|
| controls profile resolution: the GMP bin count scales inversely with~$\beta$.
|
|
|
| The dataframe recording approach (\Cref{sec:dataframe}) showed concretely
|
| how 10~raw datapoints map to CMP bins with gaps, how gap-filling
|
| produces a GMP dataframe in which every price bin is populated, and how
|
| these intermediate bins are classified into directional up/down stacks
|
| to yield a structural footprint. The accompanying charts and tabulated
|
| outputs make the method reproducible and directly applicable to real-world
|
| tick streams.
|
|
|
| Future directions include extending the gap-filling convention to
|
| weighted interpolation (where intermediate bins receive fractional stacks
|
| proportional to their traversal speed) and evaluating mBA-GMP on
|
| live order-book data across multiple asset classes.
|
|
|
|
|
| \newpage
|
|
|
|
|
|
|
| \bibliographystyle{IEEEtran}
|
| \bibliography{references}
|
|
|
| \end{document}
|
|
|