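Thesis title: A Multi-Factor LightGBM Stock Selection Strategy Based on Following Northbound Capital
Thesis type: Trading strategy design
Specialization: Investment management
Abstract
Northbound capital refers to funds that flow in from the Hong Kong market through
Northbound Stock Connect, under the mainland-Hong Kong mutual market access
mechanism, and are allocated to A-share listed companies. From the opening of the
Northbound Stock Connect channel on November 17, 2014 to December 23, 2021,
cumulative inflows totaled RMB 1,619.909 billion, 1,485 constituent stocks were held,
and the total market value of A-share holdings reached RMB 74,934.060 billion. The
sustained inflow of northbound capital and its strong, stable returns indicate, to some
extent, the broad investment prospects of A-shares and their trend toward
internationalization. Meanwhile, with the development of information technology and
the maturing of investment mechanisms, quantitative investment based on fundamental
and technical factors has entered public view, and earning stable returns through
programmatic, computer-driven trading is increasingly sought after by investors.
Against this background, this thesis uses traditional fundamental factors and
technical indicators and places its main emphasis on northbound capital following
factors, in particular combining dynamic and static holding preferences through timing.
It compares three machine learning algorithms for stock selection modeling: the classic
random forest, LightGBM and CatBoost. Model stability is tested on three investment
samples (Shanghai A-shares, Shenzhen A-shares and the whole A-share market), and
once the best algorithm is chosen, a quantitative portfolio that can stably earn excess
returns is constructed.
Finally, this thesis takes the fundamental factors, technical indicator factors and
northbound capital following factors of A-shares over a total of 19 quarters in the past
four years. After data preprocessing, LightGBM is selected for modeling, and key
classification metrics are used to evaluate the algorithm's performance and test its
stability. Based on the predicted positive-class probability, the 5 stocks with the highest
probability of rising are selected in each period and held at equal weights when positions
are opened and rebalanced. Two stock selection strategies are also compared: a simple
combination of the dynamic and static northbound holding factors, and a timing-based
combination. The final capital reached RMB 2.1463 million, with a cumulative total
return of 114.63% and an annualized compound return of 24.38%, indicating that the
quantitative portfolio constructed in this thesis performs well.
Keywords: quantitative investment; multi-factor stock selection; machine learning; LightGBM algorithm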
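The reported performance figures can be cross-checked with simple arithmetic. The sketch below assumes an initial capital of RMB 1,000,000, which the abstract does not state but which is implied by a final capital of RMB 2.1463 million together with a cumulative return of 114.63%. The backtest length is not stated either, so it is solved for from the annualized compound return. Function and variable names are illustrative only.

```python
import math


def cumulative_return(initial: float, final: float) -> float:
    """Total return over the whole period, expressed as a fraction."""
    return final / initial - 1.0


def implied_years(total_return: float, annualized: float) -> float:
    """Holding period T that solves (1 + annualized) ** T == 1 + total_return."""
    return math.log(1.0 + total_return) / math.log(1.0 + annualized)


initial_capital = 1_000_000.0  # assumption: implied by the figures, not stated
final_capital = 2_146_300.0    # RMB 2.1463 million, as reported in the abstract

total = cumulative_return(initial_capital, final_capital)
years = implied_years(total, 0.2438)  # 24.38% annualized compound return

print(f"cumulative return: {total:.2%}")
print(f"implied backtest length: {years:.2f} years")
```

Under these assumptions the three reported figures are mutually consistent, and the implied length would correspond to roughly 3.5 years, or about 14 quarterly periods.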
Abstract
Northbound capital refers to funds that flow from the Hong Kong market through
Northbound Stock Connect, under the mainland-Hong Kong mutual market access
mechanism, and are allocated to A-share listed companies. Between the opening of the
Northbound Stock Connect channel on November 17, 2014 and December 23, 2021,
cumulative inflows reached RMB 1,619.909 billion, 1,485 constituent stocks were held,
and the total market value of A-share holdings reached RMB 74,934.060 billion. The
sustained inflow of northbound capital and its strong, stable returns indicate, to some
extent, the broad investment prospects of A-shares and their trend toward
internationalization. At the same time, with the development of information technology
and the maturing of investment mechanisms, quantitative investment based on
fundamental and technical factors has entered public view, and earning stable returns
through programmatic, computer-driven trading is increasingly sought after by investors.
Against this background, this thesis uses traditional fundamental factors and
technical indicators and focuses on northbound capital following factors, in particular
combining dynamic and static holding preferences through timing. It compares three
machine learning algorithms for stock selection modeling: the classic random forest,
LightGBM and CatBoost. Model stability is tested on Shanghai A-shares, Shenzhen
A-shares and the whole A-share market. After the best algorithm is selected, a
quantitative portfolio that can stably earn excess returns is constructed.
Finally, this thesis takes the fundamental factors, technical indicator factors and
northbound capital following factors of A-shares over a total of 19 quarters in the past
four years. After data preprocessing, LightGBM is selected for modeling, and key
classification metrics are used to evaluate the algorithm's performance and stability.
Based on the predicted positive-class probability, the five stocks with the highest
probability of rising are selected in each period and held at equal weights when positions
are opened and rebalanced. A simple combination of the dynamic and static northbound
holding factors is also compared with a timing-based combination. The final capital
reached RMB 2.1463 million, the cumulative total return was 114.63%, and the
annualized compound return was 24.38%, indicating that the quantitative portfolio
constructed in this thesis performs well.
Keywords: quantitative investment; multi-factor stock selection; machine learning;
LightGBM algorithm
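The selection rule summarized in the abstract (rank stocks by the predicted probability of rising, keep the top five, hold them at equal weight) can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the function name `select_top_k`, the stock codes and the probability values are all hypothetical, and in practice the probabilities would come from the fitted classifier.

```python
from typing import Dict


def select_top_k(prob_up: Dict[str, float], k: int = 5) -> Dict[str, float]:
    """Return equal portfolio weights for the k stocks with the highest
    predicted probability of rising.

    prob_up maps a stock code to the predicted positive-class probability
    for the coming period (hypothetical inputs in this sketch).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Sort by probability, descending; break ties by stock code for determinism.
    ranked = sorted(prob_up.items(), key=lambda item: (-item[1], item[0]))
    chosen = [code for code, _ in ranked[:k]]
    if not chosen:
        return {}
    weight = 1.0 / len(chosen)  # equal weighting across the selected stocks
    return {code: weight for code in chosen}


# Hypothetical predicted probabilities for one rebalancing period.
example_probs = {
    "STOCK_A": 0.81, "STOCK_B": 0.64, "STOCK_C": 0.77, "STOCK_D": 0.52,
    "STOCK_E": 0.69, "STOCK_F": 0.73, "STOCK_G": 0.48,
}
weights = select_top_k(example_probs, k=5)
```

Repeating this at each rebalancing date with fresh probabilities gives the sequence of target portfolios; transaction costs and trading constraints are deliberately left out of the sketch.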
Contents
Chapter 1 Introduction
1.1 Research Background
1.2 Research Purpose and Significance
1.3 Research Content, Methods and Technical Roadmap
1.3.1 Research Content
1.3.2 Research Methods
1.3.3 Technical Roadmap
1.4 Innovations of This Thesis
Chapter 2 Review of Related Models and Literature
2.1 Review of Related Models
2.1.1 Multi-Factor Stock Selection Model
2.1.2 Machine Learning Algorithms
2.2 Literature Review
2.2.1 Northbound Capital and Stock Price Volatility
2.2.2 Quantitative Investment Strategies and Multi-Factor Stock Selection
2.2.3 Applications of Machine Learning in Quantitative Investment
Chapter 3 Selection and Preprocessing of Traditional Factor Data
3.1 Data Sources
3.2 Selection and Screening of Traditional Sub-Factors
3.3 Data Preprocessing
3.3.1 Missing Value Handling
3.3.2 Extreme Value Handling
3.3.3 Normalization
3.3.4 Principal Component Analysis of Sub-Factors
3.4 Chapter Summary
... (remainder omitted)