The Banking Executive Magazine - February 2025 Issue
Issue 193, January 2025 - The Banking Executive - FinTech and AI Chronicle

They told a story of a company that functioned more like a research lab than a for-profit enterprise and was unencumbered by the hierarchical traditions of China's high-pressure tech industry, even as it became responsible for what many investors see as the latest breakthrough in AI.

DIFFERENT PATH

Liang was born in 1985 in a rural village in the southern province of Guangdong. He later obtained communication engineering degrees at the elite Zhejiang University.

One of his first jobs was running a research department at a smart imaging firm in Shanghai. His then-boss, Zhou Chaoen, told state media on Feb. 9 that Liang had hired prize-winning algorithm engineers and operated with a "flat management style."

At DeepSeek and High-Flyer, Liang has similarly shunned the practices of Chinese tech giants known for rigid top-down management, low pay for young employees and "996" - working from 9 a.m. to 9 p.m. six days a week.

Liang opened his Beijing office within walking distance of Tsinghua University and Peking University, China's two most prestigious educational institutions. He regularly delved into technical details and was happy to work alongside the Gen-Z interns and recent graduates who made up the bulk of the company's workforce, according to two former employees. They also described usually working eight-hour days in a collaborative atmosphere.

"Liang gave us control and treated us as experts. He constantly asked questions and learned alongside us," said 26-year-old researcher Benjamin Liu, who left the company in September. "DeepSeek allowed me to take ownership of critical parts of the pipeline, which was very exciting."

Liang did not respond to questions sent via DeepSeek.
While Baidu and other Chinese tech giants were racing to build their consumer-facing versions of ChatGPT in 2023 and profit from the global AI boom, Liang told Chinese media outlet Waves last year that he deliberately avoided spending heavily on app development, focusing instead on refining the AI model's quality.

Both DeepSeek and High-Flyer are known for paying generously, according to three people familiar with their compensation practices. At High-Flyer, it is not uncommon for a senior data scientist to make 1.5 million yuan annually, while competitors rarely pay more than 800,000 yuan, said one of the people, a rival quant fund manager who knows Liang.

The largesse was funded by High-Flyer, which became one of China's most successful quant funds and, even after a government crackdown on the sector, still manages tens of billions of yuan, according to two people in the industry.

COMPUTING POWER

DeepSeek's success with a low-cost AI model is based on High-Flyer's decade-long and substantial investment in research and computing power, three people said.

The quant fund was an early pioneer in AI trading and a top executive said in 2020 that High-Flyer was going "all in" on AI by re-investing 70% of its revenue, mostly into AI research.

High-Flyer spent 1.2 billion yuan on two supercomputing AI clusters in 2020 and 2021. The second cluster, Fire-Flyer II, was made up of around 10,000 Nvidia A100 chips, used for training AI models.

DeepSeek had not been established at that time, so the accumulation of computing power caught the attention of Chinese securities regulators, said a person with direct knowledge of officials' thinking.

"Regulators wanted to know why they need so many chips?" the person said. "How they were going to use it? What kind of impact would that have on the market?"

Authorities decided not to intervene, in a move that would prove crucial for DeepSeek's fortunes: the U.S.
banned the export of A100 chips to China in 2022, at which point Fire-Flyer II was already in operation.

Beijing now celebrates DeepSeek, but has instructed it not to engage with the media without approval, according to a person familiar with Chinese official thinking.

Authorities had asked Liang to keep a low profile because they were worried that too much hype in the media would draw unnecessary attention, the person said.

China's cabinet and commerce ministry, as well as China's securities regulator, did not respond to requests for comment.

As one of the few companies with a large A100 cluster, High-Flyer and DeepSeek were able to attract some of China's best research talent, two former employees said. "The key advantage of vast (computing) resources is that it allows for large-scale experimentation," said Liu, the former employee.

Some Western AI entrepreneurs, like Scale AI CEO Alexandr Wang, have claimed that DeepSeek had as many as 50,000 higher-end Nvidia chips that are banned for export to China. He has not produced evidence for the allegation or responded to Reuters' requests to provide proof.

DeepSeek has not responded to Wang's claims. Two former employees attributed the company's success to Liang's focus on more cost-effective AI architecture.

The startup used techniques like Mixture-of-Experts (MoE) and multi-head latent attention (MLA), which incur far lower computing costs, its