| | \documentclass[conference]{IEEEtran}
|
| |
|
| |
|
| | \usepackage[utf8]{inputenc}
|
| | \usepackage[T1]{fontenc}
|
| | \usepackage{amsmath, amssymb, amsfonts}
|
| | \usepackage{graphicx}
|
| | \usepackage{booktabs}
|
| | \usepackage{hyperref}
|
| | \usepackage{float}
|
| | \usepackage{caption}
|
| | \usepackage{subcaption}
|
| | \usepackage{xcolor}
|
| | \usepackage{enumitem}
|
| | \usepackage{cite}
|
| | \usepackage{array}
|
| | \usepackage{url}
|
| |
|
| | \hypersetup{
|
| | colorlinks=true,
|
| | linkcolor=blue!70!black,
|
| | citecolor=blue!70!black,
|
| | urlcolor=blue!60!black,
|
| | }
|
| |
|
| |
|
| | \title{mBA-Profile: Market Profile Construction from Microsecond Bid-Ask Unit Data Using The Path-weighted Gap-filling Approach}
|
| |
|
| | \author{
|
| | \IEEEauthorblockN{
|
| | Rembrant Oyangoren Albeos~
|
| | \href{https://orcid.org/0009-0006-8743-4419}{
|
| | \includegraphics[height=8pt]{ORCID_icon.png}
|
| | }
|
| | \textsuperscript{\hyperref[sec:author_info]{$\dagger$}}
|
| | }
|
| | \IEEEauthorblockA{
|
| | \includegraphics[height=7pt]{ContinualQuasars_icon.png}\hspace{0.4em}Continual Quasars\\
|
| | }
|
| | }
|
| |
|
| | \begin{document}
|
| | \maketitle
|
| |
|
| |
|
| | \begin{abstract}
|
| | Conventional market profile construction collapses raw price data
|
| | directly into a Y-distribution histogram, recording only the price
|
| | levels that were explicitly quoted by the exchange or broker feed.
|
| | This paper presents an alternative approach---termed
|
| | \emph{path-weighted gap-filling}---in which synthetic trail-datapoints
|
| | are inserted at every intermediate unit-level price between
|
| | consecutive observations, producing an extended dataset that yields a
|
| | substantially denser and more continuous market profile. The
|
| | modelling is grounded in microsecond-resolution raw bid/ask unit data
|
| | rather than aggregated TOHLC (time, open, high, low, close) bars or
|
| | volume figures, thereby preserving the highest available fidelity of
|
| | the underlying price process. We demonstrate the approach on a full
|
| | trading day of \texttt{XAUUSDc} data collected from a live trading
|
| | environment, and show that the gap-filled profile eliminates the empty
|
| | bins and sparse regions that afflict raw-unit profiles during fast
|
| | directional moves, producing a more representative picture of
|
| | intraday price dynamics. All resources and code used in this work are available on GitHub at \url{https://github.com/ContinualQuasars/mBA-Profile}.
|
| |
|
| | \end{abstract}
|
| |
|
| | \begin{IEEEkeywords}
|
| | Market profile, unit data, microsecond bid--ask, market microstructure, path-weighted, gap-filling.
|
| | \end{IEEEkeywords}
|
| |
|
| |
|
| | \section{Introduction}
|
| | \label{sec:intro}
|
| |
|
| | \subsection{Market Microstructure and Unit Data}
|
| |
|
| | At the most granular level of market data, financial instruments are
|
| | quoted through discrete data updates: each update represents a change in
|
| | the best bid price, the best ask price, or both
|
| | simultaneously~\cite{hasbrouck2007,ohara1995}. Modern trading
|
| | platforms such as MetaTrader~5 (MT5) record these events with
|
| | millisecond-resolution timestamps, and the data is made available
|
| | through a Python API~\cite{mt5docs}.
|
| |
|
| | The instrument studied in this paper is \texttt{XAUUSDc}, a
|
| | gold CFD (Contract for Difference) traded on a
|
| | standard cent live trading account provided by the Exness broker,
|
| | accessed through the MT5 platform. The \texttt{c} suffix in
|
| | \texttt{XAUUSDc} is an Exness broker account-type indicator
|
| | (denoting a standard cent live account) and has no bearing on the
|
| | XAU price data itself---extracting data from \texttt{XAUUSDc}
|
| | (cent account) or \texttt{XAUUSDm} (dollar account) yields the
|
| | same XAUUSD price data with three decimal places. The minimum price
|
| | increment for this instrument is exactly \$0.001.
|
| | Because the standard lot size for gold is 100 troy ounces, a single
|
| | price movement of \$0.001 corresponds to a profit-or-loss change of
|
| | \$0.10 per standard lot. In this study, the market profile
|
| | bin size is set to \$0.01 (one unit, where 0.01~unit = \$0.01 XAU
|
| | price change), to produce a more
|
| | stable and interpretable distribution.
|
| |
|
| | \subsection{The Market Profile Concept}
|
| |
|
| | A market profile is a rotated histogram of price over a defined time
|
| | window. The concept was introduced by J.~Peter Steidlmayer at the
|
| | Chicago Board of Trade in the 1980s~\cite{steidlmayer1986}.
|
| | Traditionally, a market profile uses 30-minute ``Time Price
|
| | Opportunity'' (TPO) letters stacked at each price level to show where
|
| | price spent the most time during a trading session~\cite{dalton2007}.
|
| | The horizontal axis represents frequency or time density, while the
|
| | vertical axis represents price.
|
| |
|
| | In this study, the concept is adapted to microsecond unit data.
|
| | Instead of 30-minute TPO letters, each histogram bar represents the
|
| | number of data updates (or interpolated price levels, in the
|
| | gap-filled approach) observed at that price. The construction is
|
| | based exclusively on raw bid/ask unit data---not on TOHLC candles or
|
| | volume bars---ensuring that no information is lost to
|
| | aggregation~\cite{ane2000,engle2000}.
|
| |
|
| | \subsection{Paper Outline}
|
| |
|
| | Section~\ref{sec:data} describes the data acquisition pipeline and
|
| | the dataset used. Section~\ref{sec:raw} details the raw unit
|
| | approach. Section~\ref{sec:filled} introduces the gap-filled
|
| | (path-weighted) approach, including a detailed explanation of why
|
| | path-weighting is used. Section~\ref{sec:comparison} provides a
|
| | comprehensive comparison of the two approaches.
|
| | Section~\ref{sec:conclusion} concludes.
|
| |
|
| |
|
| |
|
| | \section{Data Acquisition}
|
| | \label{sec:data}
|
| |
|
| | \subsection{Trading Environment}
|
| |
|
| | The unit data used in this study was collected from a standard cent
|
| | live trading account on the Exness broker, accessed through MetaTrader~5.
|
| | MT5 is a multi-asset trading platform developed by MetaQuotes Software
|
| | Corp.\ that is widely used for forex and CFD
|
| | trading~\cite{mt5docs,metaquotes2024}. Its Python integration exposes
|
| | the function \texttt{copy\_ticks\_range()}, which returns every data
|
| | update within a specified time window as a structured NumPy
|
| | array~\cite{numpy2020}. Each data record contains the following
|
| | fields: a Unix timestamp in seconds, a millisecond-precision timestamp
|
| | providing sub-second resolution, the best bid price, the best ask
|
| | price, and additional metadata including flags indicating which fields
|
| | changed on that particular update.
|
| |
|
| | Although the exposed timestamp has millisecond granularity, the MT5
|
| | documentation describes the system as operating at microsecond
|
| | internal resolution~\cite{mt5docs}; the millisecond field is what is
|
| | exposed through the Python API.
|
| |
|
| | \subsection{Dataset Summary}
|
| |
|
| | The symbol is \texttt{XAUUSDc}. The time range covers the full UTC
|
| | day of February~12, 2026, from 00:00:00 to 23:59:59. The flag used
|
| | retrieves all data updates regardless of whether the bid, ask, or last
|
| | price changed.
|
| |
|
| | The query returned exactly \textbf{393,252~data points}. The first data point was
|
| | recorded at \textbf{2026-02-12 00:00:00.149~UTC} and the last data point at
|
| | \textbf{2026-02-12 23:59:57.820~UTC}. The bid price ranged from a
|
| | low of \textbf{\$4,878.380} to a high of \textbf{\$5,083.750}, a span
|
| | of \textbf{\$205.370} (20,537~units). The ask price ranged from
|
| | \textbf{\$4,878.620} to \textbf{\$5,083.990}, a span of
|
| | \textbf{\$205.370} (20,537~units).
|
| |
|
| | \subsection{Unit Size}
|
| |
|
| | The unit size for \texttt{XAUUSDc} is \textbf{\$0.010} (0.01~unit,
|
| | where 0.01~unit = \$0.01 XAU price change).
|
| | This value is determined by the broker's symbol specification and is
|
| | not configurable by the user. The lowest price resolution of XAU is
|
| | three decimal places: a change of 0.001 corresponds to a
|
| | \$0.001 price movement. Throughout this paper, $\delta = 0.010$
|
| | denotes the unit size, and the bin width used for histogram
|
| | construction equals 0.01~unit ($w = \delta = 0.010$).
|
| |
|
| |
|
| |
|
| | \section{Approach~1: Raw Unit Y-Distribution}
|
| | \label{sec:raw}
|
| |
|
| | \subsection{Methodology}
|
| |
|
| | The raw unit approach constructs a market profile histogram directly
|
| | from the 393,252 observed unit prices without any interpolation or
|
| | modification. The procedure begins by extracting the bid and ask
|
| | columns as separate arrays from the dataset. Histogram bin edges are
|
| | computed starting from
|
| | $\lfloor p_{\min}/\delta \rfloor \cdot \delta - \delta$ up to
|
| | $\lceil p_{\max}/\delta \rceil \cdot \delta + \delta$, spaced by
|
| | exactly $\delta = 0.010$. This ensures that every observed price
|
| | falls cleanly within a bin whose width is exactly 0.01~unit. Bin edges
|
| | are rounded to avoid floating-point precision
|
| | artefacts~\cite{goldberg1991}.
|
| |
|
| | A standard frequency histogram is then computed---the count of data points
|
| | whose price falls within each bin---separately for bid and ask. The
|
| | histogram is plotted horizontally, with price on the vertical axis and
|
| | count on the horizontal axis, creating the conventional market-profile
|
| | appearance where the thickest region corresponds to the price level
|
| | that received the most data updates.
|
| |
|
| | \subsection{Feature Engineering}
|
| |
|
| | The feature engineering pipeline for the raw approach consists of the
|
| | following stages. First, the raw unit data from MT5 (a structured
|
| | array) is converted into a tabular format. The millisecond-precision
|
| | timestamp column is transformed into a UTC-aware datetime
|
| | representation. Next, the datetime values are converted to
|
| | floating-point date numbers suitable for high-performance
|
| | plotting~\cite{matplotlib2007}. This pre-conversion is performed once
|
| | before plotting because passing raw datetime objects to the plotting
|
| | library triggers an internal per-element conversion that is extremely
|
| | slow for arrays of 393,252 elements---the pre-conversion reduces
|
| | plotting time from several minutes to under one minute for the full
|
| | dataset.
|
| |
|
| | The histogram bin edges are constructed using a range function with
|
| | step size equal to $\delta$ and then rounded. For the observed data
|
| | range of \$4,878.380 to \$5,083.990, this produces 20,563 bin edges
|
| | defining 20,562 bins, each exactly \$0.010 wide (0.01~unit).
|
| |
|
| | \subsection{Output}
|
| |
|
| | The output is a $2 \times 2$ subplot figure. The top row displays the
|
| | bid data: a horizontal histogram on the left (blue) and a time-series
|
| | line chart on the right (blue). The bottom row displays the ask data
|
| | in the same layout using red. The two rows share their respective
|
| | Y-axes so that price levels align horizontally between the histogram
|
| | and the line chart.
|
| |
|
| | \begin{figure*}[t]
|
| | \centering
|
| | \includegraphics[width=\textwidth]{raw_ticks_4panel.png}
|
| | \caption{Raw unit Y-distribution histograms (left column) and
|
| | time-series line charts (right column) for bid (top, blue) and ask
|
| | (bottom, red) prices of \texttt{XAUUSDc} on February~12, 2026.
|
| | The dataset contains 393,252 data points. Bin size = 0.01~unit (\$0.010).
|
| | The histogram X-axis shows the count of data points observed at each
|
| | price level.}
|
| | \label{fig:raw_4panel}
|
| | \end{figure*}
|
| |
|
| | \subsection{Interpretation}
|
| |
|
| | In the raw histogram (Figure~\ref{fig:raw_4panel}), the count at each
|
| | price level reflects how many times the market's best bid or best ask
|
| | was updated to that exact price. Levels where the market
|
| | consolidated---spending extended time with many small quote
|
| | updates---accumulate high counts and form the thick horizontal bars in
|
| | the profile~\cite{dalton2007}.
|
| |
|
| | However, when the market jumps from price $A$ to price $B$ in a single
|
| | update without quoting any intermediate level, those intermediate levels
|
| | receive zero counts in the histogram. The raw profile therefore
|
| | contains \emph{gaps}---entire price levels with no
|
| | representation---that correspond to fast directional moves. This is a
|
| | fundamental limitation: the profile faithfully records only what was
|
| | quoted, but it does not capture the price path traversed between
|
| | observations. This motivates the gap-filled approach presented in
|
| | Section~\ref{sec:filled}.
|
| |
|
| |
|
| |
|
| | \section{Approach~2: Gap-Filled (Path-Weighted) Y-Distribution}
|
| | \label{sec:filled}
|
| |
|
| | \subsection{Motivation}
|
| |
|
| | Consider a scenario where the bid price moves from \$5,060.000 to
|
| | \$5,060.100 in a single update. In the raw approach, only two price
|
| | levels---\$5,060.000 and \$5,060.100---register a count, while the
|
| | eight intermediate levels (\$5,060.010 through \$5,060.090) receive no
|
| | representation at all. Yet, under the assumption that price is a
|
| | continuous process sampled at discrete intervals, the price must have
|
| | traversed those eight levels to arrive at
|
| | \$5,060.100~\cite{cont2001,bacry2012}. The gap-filled approach
|
| | addresses this by inserting synthetic trail-datapoints at every
|
| | intermediate unit-level price between consecutive observations,
|
| | thereby constructing a profile that reflects the full path traversed
|
| | by the market rather than only the endpoints of each move.
|
| |
|
| | \subsection{Why Path-Weighting?}
|
| | \label{sec:whypathweight}
|
| |
|
| | The term \emph{path-weighted} refers to the fact that each price
|
| | level's histogram count is weighted by the number of times the price
|
| | path crossed that level, not merely the number of times it was
|
| | explicitly quoted. The rationale for this weighting rests on three
|
| | observations:
|
| |
|
| | \begin{enumerate}[leftmargin=*]
|
| | \item \textbf{Continuity of the price process.} Financial prices
|
| | are fundamentally continuous stochastic processes sampled at
|
| | discrete intervals by the exchange or broker
|
| | feed~\cite{cont2001,bacry2012}. Between any two consecutive
|
| | observations at prices $p_A$ and $p_B$, the underlying price
|
| | process must have traversed every intermediate level. The raw
|
| | profile discards this traversal information; the path-weighted
|
| | profile recovers it.
|
| |
|
| | \item \textbf{Elimination of empty bins.} In the raw profile,
|
| | fast directional moves produce stretches of price levels with zero
|
| | counts, creating discontinuities in the histogram that can mislead
|
| | visual interpretation. Path-weighting ensures that every price
|
| | level between $p_{\min}$ and $p_{\max}$ receives a non-zero count,
|
| | producing a continuous and visually coherent
|
| | profile~\cite{steidlmayer1986}.
|
| |
|
| | \item \textbf{Traversal as a proxy for significance.} A price
|
| | level that is crossed repeatedly---even by fast-moving price
|
| | swings that do not dwell there---is a level that the market
|
| | revisits often. Such levels frequently correspond to support,
|
| | resistance, or areas of high liquidity~\cite{dalton2007,
|
| | steidlmayer1986}. Path-weighting captures this repeated-traversal
|
| | signal, which raw unit counting misses entirely.
|
| | \end{enumerate}
|
| |
|
| | In summary, path-weighting transforms the market profile from a
|
| | histogram of \emph{quoting intensity} into a histogram of
|
| | \emph{traversal frequency}, which is a richer and more informative
|
| | representation of where the market has been.
|
| |
|
| | \subsection{Algorithm}
|
| |
|
| | The gap-filling algorithm operates on pairs of consecutive data points. For
|
| | each pair $(A, B)$ with prices $p_A$ and $p_B$ and timestamps $t_A$
|
| | and $t_B$ (represented as nanosecond integers for computational
|
| | efficiency), the algorithm first computes the signed unit difference
|
| | $\Delta n = \text{round}((p_B - p_A) / \delta)$. If
|
| | $|\Delta n| \le 1$, no interpolation is needed because the two prices
|
| | are adjacent or identical, and the pair is left unchanged. If
|
| | $|\Delta n| > 1$, the algorithm inserts $|\Delta n| - 1$ intermediate
|
| | rows. Each intermediate row $k$ (where $1 \le k < |\Delta n|$)
|
| | receives a price of
|
| | $p_A + k \cdot \text{sgn}(\Delta n) \cdot \delta$ and a timestamp of
|
| | $t_A + \frac{k}{|\Delta n|} \cdot (t_B - t_A)$. The timestamp
|
| | interpolation is linear, distributing the intermediate points evenly
|
| | across the time interval between data points $A$ and
|
| | $B$~\cite{dacorogna2001}.
|
| |
|
| | The implementation is fully vectorised using array operations rather
|
| | than interpreted loops~\cite{numpy2020}. The key operations are
|
| | element repetition (to repeat each source index by the number of units
|
| | in its segment), cumulative summation (to compute segment start
|
| | positions), and element-wise arithmetic for price and timestamp
|
| | interpolation. This vectorised approach processes the entire
|
| | 393,252-point dataset in under 2~seconds on a consumer-grade machine.
|
| |
|
| | The gap-filling is applied independently to the bid series and the ask
|
| | series because the bid and ask prices can move by different amounts on
|
| | the same data update. After gap-filling, the bid series expands from
|
| | 393,252 rows to exactly \textbf{4,614,400~rows} (an expansion factor
|
| | of $11.73\times$), and the ask series expands from 393,252 rows to
|
| | exactly \textbf{4,619,918~rows} (an expansion factor of
|
| | $11.75\times$).
|
| |
|
| | \subsection{Feature Engineering}
|
| |
|
| | The feature engineering pipeline for the gap-filled approach shares the
|
| | initial stages with the raw approach: data fetching, tabular
|
| | conversion, and datetime derivation are identical. The additional
|
| | stage is the gap-filling itself, which produces two new arrays of
|
| | expanded prices and their corresponding interpolated timestamps.
|
| |
|
| | For plotting, the expanded nanosecond timestamps must be converted to
|
| | floating-point date numbers. Because the expanded arrays contain
|
| | approximately 4.6 million elements, calling a datetime conversion
|
| | function on individual objects would be prohibitively slow. Instead,
|
| | the conversion is performed arithmetically: the nanosecond integer is
|
| | divided by $10^9$ to get seconds, then by 86,400 to get fractional
|
| | days since the Unix epoch, and finally offset by the appropriate
|
| | constant to align with the plotting library's date
|
| | system~\cite{matplotlib2007}. This bypasses all object-level datetime
|
| | creation and processes the 4.6 million timestamps in a single
|
| | vectorised operation.
|
| |
|
| | The histogram bins are constructed identically to the raw approach,
|
| | using 0.01-unit (\$0.010) bin widths. Because the gap-filled data has
|
| | the same price range as the raw data (\$4,878.380 to \$5,083.990),
|
| | the number of bins is also 20,562.
|
| |
|
| | \subsection{Output}
|
| |
|
| | The output figure has the identical $2 \times 2$ subplot layout as
|
| | Figure~\ref{fig:raw_4panel}.
|
| |
|
| | \begin{figure*}[t]
|
| | \centering
|
| | \includegraphics[width=\textwidth]{filled_ticks_4panel.png}
|
| | \caption{Gap-filled (path-weighted) Y-distribution histograms
|
| | (left column) and time-series line charts (right column) for bid
|
| | (top, blue) and ask (bottom, red) prices of \texttt{XAUUSDc} on
|
| | February~12, 2026. The bid series contains 4,614,400 data points
|
| | and the ask series contains 4,619,918 data points after
|
| | gap-filling. Bin size = 0.01~unit (\$0.010). The histogram X-axis
|
| | shows the path-weighted count: the number of times each price
|
| | level was traversed between consecutive data points, including synthetic
|
| | intermediate points.}
|
| | \label{fig:filled_4panel}
|
| | \end{figure*}
|
| |
|
| | \subsection{Interpretation}
|
| |
|
| | The gap-filled histogram (Figure~\ref{fig:filled_4panel}) answers a
|
| | fundamentally different question than the raw histogram. Where the
|
| | raw profile asks ``how many times was price \emph{quoted} at this
|
| | level,'' the gap-filled profile asks ``how many times did the price
|
| | \emph{path} cross this level.'' The practical consequence is visible
|
| | in the histogram scale: the raw histogram peaks at counts near 120,
|
| | while the gap-filled histogram peaks at counts near 1,200 (consistent
|
| | with the $\approx 11.7\times$ average expansion factor).
|
| |
|
| | Price regions that were traversed frequently---even if the market did
|
| | not dwell there long enough to generate many raw tick
|
| | updates---accumulate higher counts in the gap-filled profile. The
|
| | large sell-off visible around 16:00~UTC, where the bid price dropped
|
| | from the \$5,050.000 region to the \$4,878.000 region in a
|
| | concentrated burst of activity, produces substantial counts at every
|
| | intermediate price level in the gap-filled profile, whereas those same
|
| | levels appear sparse or empty in the raw profile because the market
|
| | jumped through them in large increments.
|
| |
|
| |
|
| |
|
| | \section{Raw vs.\ Gap-Filled: Comprehensive Comparison}
|
| | \label{sec:comparison}
|
| |
|
| | \subsection{What Each Approach Measures}
|
| |
|
| | The raw approach counts only actual tick updates from the broker's
|
| | data feed. When a price level receives a high count, it means the
|
| | market's best bid or ask was actively updated to that level many
|
| | times. This is a direct measurement of \emph{quoting
|
| | intensity}~\cite{ohara1995}: how frequently market participants were
|
| | placing or modifying orders at that price.
|
| |
|
| | The gap-filled approach counts every tick-level price between
|
| | consecutive updates, including synthetic intermediate points that were
|
| | never explicitly quoted. When a price level receives a high count in
|
| | the gap-filled profile, it means the price \emph{path} crossed that
|
| | level many times---either through actual quoting or through
|
| | interpolation during price jumps. This is a measurement of
|
| | \emph{traversal frequency}.
|
| |
|
| | \subsection{Detailed Comparison}
|
| |
|
| | Table~\ref{tab:comparison} presents a comprehensive side-by-side
|
| | comparison of the two approaches across all relevant variables.
|
| |
|
| | \begin{table*}[t]
|
| | \centering
|
| | \caption{Comprehensive comparison of raw tick vs.\ gap-filled
|
| | (path-weighted) market profile construction for \texttt{XAUUSDc} on
|
| | February~12, 2026.}
|
| | \label{tab:comparison}
|
| | \small
|
| | \begin{tabular}{@{}p{3.8cm}p{5.8cm}p{5.8cm}@{}}
|
| | \toprule
|
| | \textbf{Variable} & \textbf{Raw Tick Profile} & \textbf{Gap-Filled (Path-Weighted) Profile} \\
|
| | \midrule
|
| | Data source &
|
| | Microsecond bid/ask ticks from MT5 &
|
| | Same raw ticks, plus synthetic trail-datapoints \\
|
| | \midrule
|
| | Bid data points &
|
| | 393,252 &
|
| | 4,614,400 ($11.73\times$ expansion) \\
|
| | \midrule
|
| | Ask data points &
|
| | 393,252 &
|
| | 4,619,918 ($11.75\times$ expansion) \\
|
| | \midrule
|
| | Price range &
|
| | \$4,878.380 -- \$5,083.990 &
|
| | \$4,878.380 -- \$5,083.990 (identical) \\
|
| | \midrule
|
| | Bin width &
|
| | $\delta = \$0.010$ (1 tick) &
|
| | $\delta = \$0.010$ (1 tick, identical) \\
|
| | \midrule
|
| | Number of bins &
|
| | 20,562 &
|
| | 20,562 (identical) \\
|
| | \midrule
|
| | Avg.\ count per bin (bid) &
|
| | $393{,}252 / 20{,}537 \approx 19.15$ &
|
| | $4{,}614{,}400 / 20{,}537 \approx 224.7$ \\
|
| | \midrule
|
| | Peak histogram count &
|
| | $\sim$120 &
|
| | $\sim$1,200 \\
|
| | \midrule
|
| | Empty bins in profile &
|
| | Many (fast moves leave gaps) &
|
| | None (all intermediate levels filled) \\
|
| | \midrule
|
| | Profile continuity &
|
| | Discontinuous; sparse in trending regions &
|
| | Continuous; no gaps across entire price range \\
|
| | \midrule
|
| | What is measured &
|
| | Quoting intensity (how often each level was quoted) &
|
| | Traversal frequency (how often price path crossed each level) \\
|
| | \midrule
|
| | Consolidation zones &
|
| | High counts---dense, well-represented &
|
| | Similar to raw (few gaps to fill when moves are small) \\
|
| | \midrule
|
| | Fast directional moves &
|
| | Sparse or empty---underrepresented &
|
| | Well-represented with interpolated traversals \\
|
| | \midrule
|
| | Support/resistance detection &
|
| | Based on quoting density only &
|
| | Enhanced: repeated traversals indicate revisited levels \\
|
| | \midrule
|
| | Interpolation method &
|
| | None &
|
| | Linear timestamp interpolation, tick-step price fill \\
|
| | \midrule
|
| | Computational cost &
|
| | Minimal (direct histogram of raw data) &
|
| | Higher ($\sim$11.7$\times$ more data to process) \\
|
| | \bottomrule
|
| | \end{tabular}
|
| | \end{table*}
|
| |
|
| | \subsection{Superiority of the Gap-Filled Approach}
|
| |
|
| | The gap-filled approach produces a fundamentally more representative
|
| | market profile than the raw tick approach. Its advantages are
|
| | threefold:
|
| |
|
| | \begin{enumerate}[leftmargin=*]
|
| | \item \textbf{Complete price coverage.} The gap-filled profile
|
| | assigns a non-zero count to every price level within the day's
|
| | range, eliminating the misleading empty bins that appear in the
|
| | raw profile during fast moves. This provides a structurally
|
| | complete picture of where the market traded.
|
| |
|
| | \item \textbf{Traversal information.} By counting path crossings
|
| | rather than only explicit quotes, the gap-filled profile captures
|
| | information about how frequently the market revisited each price
|
| | level---information that is entirely absent from the raw profile.
|
| | This traversal signal is directly relevant to identifying dynamic
|
| | support and resistance~\cite{dalton2007}.
|
| |
|
| | \item \textbf{Robustness to feed granularity.} Different brokers
|
| | and feed providers update tick data at different rates. A slower
|
| | feed produces larger jumps between consecutive ticks, which
|
| | creates more gaps in the raw profile. The gap-filled approach is
|
| | robust to this variation because it reconstructs the intermediate
|
| | path regardless of the feed's update frequency.
|
| | \end{enumerate}
|
| |
|
| | The primary trade-off is computational cost: the gap-filling process
|
| | multiplies the dataset by a factor of approximately $11.7\times$ in
|
| | this study, which proportionally increases the time required for
|
| | histogram computation and rendering compared to a typical raw-data
|
| | market profile. For very long time horizons or very volatile
|
| | instruments, this expansion factor could be significantly larger.
|
| |
|
| | \subsection{Interaction with Bin Size}
|
| |
|
| | At the 1-tick bin width ($w = 0.010$) used throughout this study,
|
| | the difference between the raw and gap-filled profiles is maximal
|
| | because gaps in the raw profile (empty bins where no tick was
|
| | observed) are filled in by the gap-filling process. As the bin width
|
| | increases, the practical difference between the two approaches
|
| | diminishes because larger bins tend to capture at least some ticks
|
| | even in the raw profile, and the synthetic intermediate points are
|
| | absorbed into the same bins as the observed ticks. At sufficiently
|
| | large bin widths, the raw and gap-filled histograms become nearly
|
| | indistinguishable~\cite{scott1979}.
|
| |
|
| |
|
| |
|
| | \section{Conclusion}
|
| | \label{sec:conclusion}
|
| |
|
| | This paper presented two approaches to constructing market profiles
|
| | from 393,252 microsecond-resolution bid/ask tick updates of
|
| | \texttt{XAUUSDc} on February~12, 2026, collected from a standard cent
|
| | live trading account on the Exness broker via MetaTrader~5. The raw
|
| | approach counted only observed tick levels, producing a profile that
|
| | reflects quoting intensity. The gap-filled (path-weighted) approach
|
| | interpolated every intermediate price level between consecutive ticks,
|
| | expanding the dataset to 4,614,400 bid rows and 4,619,918 ask rows,
|
| | and producing a profile that reflects path traversal frequency.
|
| |
|
| | The gap-filled approach yields a more complete and informative market
|
| | profile by eliminating empty bins, capturing traversal information,
|
| | and providing robustness to variations in feed update frequency. The
|
| | primary cost of this approach is computational: the gap-filling
|
| | process multiplies the dataset size by a factor of approximately
|
| | $11.7\times$, which may result in slower calculation times compared to
|
| | typical market profile construction from raw data.
|
| |
|
| |
|
| |
|
| |
|
| | \newpage
|
| | \vspace{2em}
|
| | \section*{Author Information}
|
| | \label{sec:author_info}
|
| |
|
| | \begin{center}
|
| | \textbf{Rembrant Oyangoren Albeos}~\href{https://orcid.org/0009-0006-8743-4419}{\includegraphics[height=10pt]{ORCID_icon.png}}
|
| | \end{center}
|
| |
|
| | \noindent\textbf{ORCID:} \url{https://orcid.org/0009-0006-8743-4419}
|
| |
|
| | \noindent\textbf{Email:} algorembrant@gmail.com
|
| |
|
| | \noindent\textbf{Affiliation:} Developer \& Researcher at ConQ
|
| |
|
| | \noindent\textbf{Organization:} Continual Quasars~\includegraphics[height=7pt]{ContinualQuasars_icon.png}
|
| |
|
| | \noindent\textbf{Organization GitHub:} \url{https://github.com/ContinualQuasars}
|
| |
|
| | \noindent\textbf{This Version:} Febuary 14, 2026
|
| |
|
| | \noindent\textbf{GitHub:} \url{https://github.com/ContinualQuasars/mBA-Profile}
|
| |
|
| |
|
| |
|
| | \newpage
|
| | \vspace{20}
|
| | \begin{thebibliography}{99}
|
| |
|
| | \bibitem{steidlmayer1986}
|
| | J.~P. Steidlmayer and K.~Koy,
|
| | \textit{Markets and Market Logic},
|
| | Porcupine Press, 1986.
|
| |
|
| | \bibitem{dalton2007}
|
| | J.~Dalton, E.~Jones, and R.~Dalton,
|
| | \textit{Mind Over Markets: Power Trading with Market Generated
|
| | Information}, Wiley, 2007.
|
| |
|
| | \bibitem{mt5docs}
|
| | MetaQuotes Software Corp.,
|
| | ``MetaTrader~5 Python Integration,''
|
| | \url{https://www.mql5.com/en/docs/python_metatrader5}, 2024.
|
| |
|
| | \bibitem{metaquotes2024}
|
| | MetaQuotes Software Corp.,
|
| | ``MetaTrader~5 Trading Platform,''
|
| | \url{https://www.metatrader5.com}, 2024.
|
| |
|
| | \bibitem{ohara1995}
|
| | M.~O'Hara,
|
| | \textit{Market Microstructure Theory},
|
| | Blackwell Publishers, 1995.
|
| |
|
| | \bibitem{hasbrouck2007}
|
| | J.~Hasbrouck,
|
| | \textit{Empirical Market Microstructure: The Institutions, Economics,
|
| | and Econometrics of Securities Trading},
|
| | Oxford University Press, 2007.
|
| |
|
| | \bibitem{cont2001}
|
| | R.~Cont,
|
| | ``Empirical properties of asset returns: Stylized facts and
|
| | statistical issues,''
|
| | \textit{Quantitative Finance}, vol.~1, no.~2, pp.~223--236, 2001.
|
| |
|
| | \bibitem{bacry2012}
|
| | E.~Bacry, M.~Mastromatteo, and J.-F. Muzy,
|
| | ``Hawkes processes in finance,''
|
| | \textit{Market Microstructure and Liquidity}, vol.~1, no.~1, 2015.
|
| |
|
| | \bibitem{dacorogna2001}
|
| | M.~M. Dacorogna, R.~Gen\c{c}ay, U.~A. M\"{u}ller, R.~B. Olsen, and
|
| | O.~V. Pictet,
|
| | \textit{An Introduction to High-Frequency Finance},
|
| | Academic Press, 2001.
|
| |
|
| | \bibitem{engle2000}
|
| | R.~F. Engle and J.~R. Russell,
|
| | ``Autoregressive conditional duration: A new model for irregularly
|
| | spaced transaction data,''
|
| | \textit{Econometrica}, vol.~66, no.~5, pp.~1127--1162, 1998.
|
| |
|
| | \bibitem{ane2000}
|
| | T.~An\'{e} and H.~Geman,
|
| | ``Order flow, transaction clock, and normality of asset returns,''
|
| | \textit{The Journal of Finance}, vol.~55, no.~5, pp.~2259--2284,
|
| | 2000.
|
| |
|
| | \bibitem{goldberg1991}
|
| | D.~Goldberg,
|
| | ``What every computer scientist should know about floating-point
|
| | arithmetic,''
|
| | \textit{ACM Computing Surveys}, vol.~23, no.~1, pp.~5--48, 1991.
|
| |
|
| | \bibitem{numpy2020}
|
| | C.~R. Harris \textit{et al.},
|
| | ``Array programming with NumPy,''
|
| | \textit{Nature}, vol.~585, pp.~357--362, 2020.
|
| |
|
| | \bibitem{matplotlib2007}
|
| | J.~D. Hunter,
|
| | ``Matplotlib: A 2D graphics environment,''
|
| | \textit{Computing in Science \& Engineering}, vol.~9, no.~3,
|
| | pp.~90--95, 2007.
|
| |
|
| | \bibitem{cmegroup2024}
|
| | CME Group,
|
| | ``Gold Futures Contract Specifications,''
|
| | \url{https://www.cmegroup.com/markets/metals/precious/gold.contractSpecs.html},
|
| | 2024.
|
| |
|
| | \bibitem{scott1979}
|
| | D.~W. Scott,
|
| | ``On optimal and data-based histograms,''
|
| | \textit{Biometrika}, vol.~66, no.~3, pp.~605--610, 1979.
|
| |
|
| | \end{thebibliography}
|
| |
|
| |
|
| |
|
| | \end{document}
|
| |
|
| |
|