<!DOCTYPE html><html lang="en" xmlns="http://www.w3.org/1999/xhtml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" style="font-size:16px;"><head></head><head><meta charset="utf-8"/><!--[if !mso]><!--><meta http-equiv="X-UA-Compatible" content="IE=edge"/><!--<![endif]--><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="x-apple-disable-message-reformatting"/><meta name="format-detection" content="telephone=no,address=no,email=no,date=no,url=no"/><meta name="color-scheme" content="light"/><meta name="supported-color-schemes" content="light"/><title>Mixture-of-Recursions, How Many Instructions Can LLMs Follow At Once? and more</title><!--[if mso]><xml><o:OfficeDocumentSettings><o:AllowPNG/><o:PixelsPerInch>96</o:PixelsPerInch></o:OfficeDocumentSettings></xml><![endif]--><style> :root { color-scheme: light; supported-color-schemes: light; } body { margin: 0; padding: 0; min-width: 100%!important; -ms-text-size-adjust: 100% !important; -webkit-transform: scale(1) !important; -webkit-text-size-adjust: 100% !important; -webkit-font-smoothing: antialiased !important; } .body { word-wrap: normal; word-spacing:normal; } table.mso { width: 100%; border-collapse: collapse; padding: 0; table-layout: fixed; } img { border: 0; outline: none; } table { mso-table-lspace: 0px; mso-table-rspace: 0px; } td, a, span { mso-line-height-rule: exactly; } #root [x-apple-data-detectors=true], a[x-apple-data-detectors=true], #MessageViewBody a { color: inherit !important; text-decoration: inherit !important; font-size: inherit !important; font-family: inherit !important; font-weight: inherit !important; line-height: inherit !important; } span.MsoHyperlink { color: inherit !important; mso-style-priority: 99 !important; } span.MsoHyperlinkFollowed { color: inherit !important; mso-style-priority: 99 !important; } .a { background-color:#dedede; } .b { background-color:#2a2a2a; } .c { background-color:#ffffff; } .d { background-color:#fff0c8; } .d2 { background-color:#FFFFFF; } .d3 { background-color:#FFFFFF; } h1 a { text-decoration:none;color:#2C81E5;font-style:italic; } h2 a { text-decoration:none;color:#2C81E5;font-style:italic; } h3 a { text-decoration:none;color:#2C81E5;font-style:italic; } h4 a { text-decoration:none;color:#2C81E5;font-style:italic; } h5 a { text-decoration:none;color:#2C81E5;font-style:italic; } h6 a { text-decoration:none;color:#2C81E5;font-style:italic; } h1, h1 a, h2, h2 a, h3, h3 a, h4, h4 a, h5, h5 a, h6, h6 a, ul, li, ol, p, p a { margin: 0;padding: 0; } h1 { font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif;font-weight:700;font-size:28px;color:#2A2A2A;line-height:42px;padding-bottom:4px;padding-top:16px;mso-margin-top-alt:16px;mso-margin-bottom-alt:4px } h2 { font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif;font-weight:700;font-size:24px;color:#2A2A2A;line-height:36px;padding-bottom:4px;padding-top:16px;mso-margin-top-alt:16px;mso-margin-bottom-alt:4px } h3 { font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif;font-weight:400;font-size:20px;color:#2A2A2A;line-height:30px;padding-bottom:4px;padding-top:16px;mso-margin-top-alt:16px;mso-margin-bottom-alt:4px } h4 { font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif;font-weight:400;font-size:18px;color:#2A2A2A;line-height:27px;padding-bottom:4px;padding-top:16px;mso-margin-top-alt:16px;mso-margin-bottom-alt:4px } h5 { font-family:'Trebuchet MS','Lucida 
Grande',Tahoma,sans-serif;font-weight:400;font-size:16px;color:#2A2A2A;line-height:24px;padding-bottom:4px;padding-top:16px;mso-margin-top-alt:16px;mso-margin-bottom-alt:4px } h6 { font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif;font-weight:400;font-size:14px;color:#2A2A2A;line-height:21px;padding-bottom:4px;padding-top:16px;mso-margin-top-alt:16px;mso-margin-bottom-alt:4px } p { font-family:'Georgia','Times New Roman',serif;font-weight:400;color:#2D2D2D;font-size:16px;line-height:24px;padding-bottom:8px;padding-top:8px;mso-margin-top-alt:8px;mso-margin-bottom-alt:8px; } p a, .e a, ul a, li a, .h a, .h2 a, .h3 a { word-break:break-word;color:#2C81E5 !important;text-decoration:none;font-style:italic; } p a span, .e a span, ul a span, li a span { color: inherit } p .bold { font-weight:bold;color:#2D2D2D; } p span[style*="font-size"] { line-height: 1.6; } .f p { font-size:12px;line-height:15px;color:#2D2D2D;padding:0; } .f p a { color:#2D2D2D !important; } .g p { font-family:'Helvetica',Arial,sans-serif;font-size:14px;line-height:20px;font-weight:normal;margin:0; } .g p a { text-decoration: underline; } .i p { font-family:'Helvetica',Arial,sans-serif;line-height:23px;font-size:15px;color:#2D2D2D; } .i p a { color:#2D2D2D !important; } .i2 p { font-family:'Helvetica',Arial,sans-serif;line-height:23px;font-size:15px;color:#2D2D2D; } .i2 p a { color:#2D2D2D !important; } .i3 p { font-family:'Helvetica',Arial,sans-serif;line-height:43px;font-size:24px;color:#2D2D2D; } .i3 p a { color:#2D2D2D !important; } .h p a { color:#595959 !important; } .h2 p a { color:#595959 !important; } .h3 p a { color:#595959 !important; } .f p a, .i p a, .i2 p a, .i3 p a, .h p a, .h2 p a, .h3 p a { text-decoration:underline; } .j { border-top:3px solid #ffeb2d; } .k p { padding-left:15px;padding-bottom:0px;padding-top:6px;mso-margin-top-alt:6px;mso-margin-bottom-alt:0px;mso-margin-left-alt:15px; } .o { background-color:#FFFFFF;border:1px solid #F1F1F1;border-radius:5px; } .o p { font-family:'Helvetica',Arial,sans-serif;padding:0px;margin:0px; } .l p, .l p a, .l a { font-size:14px;line-height:20px;font-weight: bold;color:#2D2D2D;padding-bottom:6px;mso-margin-bottom-alt:6px;text-decoration:none; } .m p, .m p a { font-size:13px;line-height:18px;font-weight:400;color:#2D2D2D;padding-bottom:6px;mso-margin-bottom-alt:6px;text-decoration:none; } .n p, .n p a { font-size:12px;line-height:17px;font-weight:400;color:#2D2D2D;padding-bottom:6px;mso-margin-bottom-alt:6px;text-decoration:none; } .p { background-color:#FFFFFF;max-width:520px;border:1px solid #E1E8ED;border:1px solid rgba(80, 80, 80, 0.3);border-radius:5px; } .q { font-size:16px;font-family:Helvetica,Roboto,Calibri,sans-serif !important;border:1px solid #e1e8ed;border:1px solid rgba(80, 80, 80, 0.3);border-radius:10px;background-color:#FFFFFF; } .q p { font-size:16px;font-family:system-ui,Helvetica,Roboto,Calibri,sans-serif !important;color:#222222;padding:4px 0; } .r { border:1px solid #E1E8ED !important;border-radius:5px; } .s p { font-size: 14px; line-height: 17px; font-weight: 400; color: #697882; text-decoration: none; } .t p { font-family:'Helvetica',Arial,sans-serif;font-size:12px;line-height:18px;font-weight:400;color:#000000;font-style:italic;padding:4px 0px 0px; } .v { border-radius:10px;border:solid 0px #DFD150;background-color:#2C81E5;font-family:'Open Sans','Segoe UI','Apple SD Gothic Neo','Lucida Grande','Lucida Sans Unicode',sans-serif;color:#FFFFFF; } .v a { text-decoration:none;display:block;color:#FFFFFF; } .w p { 
font-size:12px;line-height:15px;font-weight:400;color:#FFFFFF; } .w p a { text-decoration: underline !important;color:#FFFFFF !important; } ul { font-family:'Helvetica',Arial,sans-serif;margin:0px 0px 0px 25px !important;padding:0px !important;color:#2D2D2D;line-height:24px;list-style:disc;font-size:16px; } ul > li { font-family:'Helvetica',Arial,sans-serif;margin:10px 0px 0px 0px !important;padding: 0px 0px 0px 0px !important; color: #2D2D2D; list-style:disc; } ol { font-family:'Helvetica',Arial,sans-serif;margin: 0px 0px 0px 25px !important;padding:0px !important;color:#2D2D2D;line-height:24px;list-style:decimal;font-size:16px; } ol > li { font-family:'Helvetica',Arial,sans-serif;margin:10px 0px 0px 0px !important;padding: 0px 0px 0px 0px !important; color: #2D2D2D; list-style:decimal; } .e h3, .e p, .e span { padding-bottom:0px;padding-top:0px;mso-margin-top-alt:0px;mso-margin-bottom-alt:0px; } .e span, .e li { font-family:'Helvetica',Arial,sans-serif;font-size:16px;color:#2D2D2D;line-height:24px; } .rec { font-family: ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, "Noto Sans", sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji" !important; } .rec__button:hover { background-color: #f9fafb !important; } .copyright a {color: inherit !important; text-decoration: none !important; font-size: inherit !important; font-family: inherit !important; font-weight: inherit !important; line-height: inherit !important;} .txt_social p { padding: 0; word-break: break-all; } .table, .table-c, .table-h { border: 1px solid #C0C0C0; } .table-c { padding:5px; background-color:#FFFFFF; } .table-c p { color: #2D2D2D; font-family:'Helvetica',Arial,sans-serif !important;overflow-wrap: break-word; } .table-h { padding:5px; background-color:#F1F1F1; } .table-h p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important;overflow-wrap: break-word; } @media only screen and (max-width:667px) { .aa, .w100pc { width: 100% !important; } .bb img { width: 100% !important; height: auto !important; max-width: none !important; } .cc { padding: 0px 8px !important; } .ee { padding-top:10px !important;padding-bottom:10px !important; } .ff ul, .ff ol { margin: 0px 0px 0px 10px !important;padding: 0px !important; } .ff li { margin:10px 0px 0px 10px !important; } .r {height:140px !important;} .s p { font-size:13px !important;line-height:15px !important; } .mob-hide {display:none !important;} .mob-show {display: block !important; width: auto !important; overflow: visible !important; float: none !important; max-height: inherit !important; line-height: inherit !important;} .mob-stack {width:100% !important;display:block !important;} .mob-w-full {width:100% !important;} .mob-block {display:block !important;} .embed-img {padding:0px 0px 12px 0px !important;} .socialShare {padding-top:15px !important;} .rec { padding-left:15px!important;padding-right:15px!important; } .bodyWrapper { padding:7px 4px 7px 4px !important; } .social-mobile {float:left !important;margin-top:10px !important;} } @media screen and (max-width: 480px) { u + .a .gg { width: 100% !important; width: 100vw !important; } .tok-heart { padding-top:75% !important; } .tok-play { padding-top: 250px !important; } } @media screen and (max-width: 320px) { .tok-heart { padding-top:65% !important; } } .u { border: 1px solid #CACACA !important; border-radius: 2px !important; background-color: #ffffff !important; padding: 0px 13px 0px 13px !important; 
font-family:ui-sans-serif,system-ui,-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,"Helvetica Neue",Arial,"Noto Sans",sans-serif !important;font-size: 12px !important; color: #767676 !important; } .u a { text-decoration: none; display: block !important; color: #767676 !important; margin: 0px !important; } .u span, .u img { color: #767676 !important;margin:0px !important; max-height:32px !important;background-color:#ffffff !important; } </style><!--[if mso]><style type="text/css"> h1, h2, h3, h4, h5, h6 {font-family: Arial, sans-serif !important;} body, table, td, p, a, span {font-family: Arial, sans-serif !important;} sup { font-size: 100% !important;vertical-align: .5em !important;mso-text-raise: -1.5% !important;line-height: 0 !important; } ul { margin-left:0px !important; margin-right:10px !important; margin-top:20px !important; margin-bottom:20px !important; } ul li { margin-left: 0px !important; mso-special-format: decimal; } ol { margin-left:0px !important; margin-right:10px !important; margin-top:20px !important; margin-bottom:20px !important; } ol li { margin-left: 0px !important; mso-special-format: decimal; } li.listItem { margin-left:15px !important; margin-top:0px !important; } .paddingDesktop { padding: 10px 0 !important; } .edm_outlooklist { margin-left: -20px !important; } .embedImage { display:none !important; } </style><![endif]--><style> @font-face { font-family: 'Open Sans'; font-style: normal; font-weight: 700; font-display: swap; src: url('https://fonts.gstatic.com/s/opensans/v40/memSYaGs126MiZpBA-UvWbX2vVnXBbObj2OVZyOOSr4dVJWUgsg-1x4gaVIUwaEQbjA.woff2') format('woff2'); } @font-face { font-family: 'Open Sans'; font-style: italic; font-weight: 700; font-display: swap; src: url('https://fonts.googleapis.com/css2?family=Open+Sans:ital,wght@1,700&display=swap') format('woff2'); } </style></head><body class="a" style="margin:0px auto;padding:0px;word-wrap:normal;word-spacing:normal;background-color:#dedede;"><div role="article" aria-roledescription="email" aria-label="email_name" lang="en" style="font-size:1rem"><div style="display:none;max-height:0px;overflow:hidden;"> Dive into the latest AI research and industry news, featuring limit testing how many instructions LLMs can follow at once, Mixture-of-Recursions, and monitoring LLM using their CoT  ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ ‌ </div><table role="none" width="100%" border="0" cellspacing="0" align="center" cellpadding="0" class="gg"><tr><td align="center" valign="top"><table role="none" width="670" border="0" cellspacing="0" cellpadding="0" class="aa" style="width:670px;table-layout:fixed;"><tr><td class="bodyWrapper" align="center" valign="top" style="padding:7px 7px 7px 7px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td align="center" valign="top" style="border-width:0px 0px 0px 0px;border-style: solid; border-color: #2a2a2a;border-radius:10px 10px 0px 0px;background-color:#ffffff;" class="c"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr id="header"><td style="padding:28px 28px 0px 28px;"><div 
style="padding-top:0px;padding-right:0px;padding-bottom:20px;padding-left:0px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td class="f" align="right" valign="top"><p> July 22, 2025 | <a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.c6q0w4g5sodbtO4I1B_pxSdB5RCIH6yy1Fm1CYma3Exz4dpDs5CkqSdQn0eFxukITy3UFqqQPX0ntEgT2foZcdC2BJuufgTpV_5CUgr8EPTGQ8yS-z3Y6PvzkuuXcQfNtSK7x_tUvBj2-DIVioDy_1WMk3XJPumCsT3SiMlOHdLLLgbhzwSTiOWMYqJGo1lq2itjFvCqKIfl54_QUcB51sc62yGJVChSREdnyOr_LLPXJ2co-TkC_h8DWptiO4KZzN-GdcC_aWCYhs4sudaDMpKM9r8UboJktdA6c6IBM1EO-kpQbEoks8nEpJzlLFb5DP1uo7ZCTl7uduzcccGBsF8KB0JP2nlEceMXx1D9oZYqtHJoKvn24hoqHIyaqHrVbRkV5vSFZDwxzoj5WPWlTIuNXPOcF-4AEI39rCB4J7Dk7z51JWizvHHsiz6BiVvnJzbFwwPjRTFbzCnv4LaderKxQ5LIG6MXf-uZoLifjCe3cTJ0FnM0LeM3iNAOoqsCMtkLv8CVnUUc0lznwHQrO8cPgV-EX9-rXI5-vJB9wdJ8a1YJ_0qORwxagnT0_IUjdX---CJ7CCHW3p3t7IbGZ6X0OJbTHVjpufyHoR6jLW4GlaIjbXXncPO5ZmP5xNva3PEM_aWfrEmand8mzhLxu5d1gkiZSa5hHLY9zlDz8FtrTuTarISy66fEHM-8NYpt9C_oN_jgVDnZ_s8QE9XjwPFsgKGMXe9W6SEwsqCsGfmY8KELctzT5jdA81TRcqS1/4ie/eQkcXh7ZTkONm8QCVXBBnw/h0/h001.tAhuqU29TA_sVvogb06ng0cM1j8uZ8FtIV2BHuhT1JU">Read Online</a></p></td></tr><tr><td class="dd" align="center" valign="top" style="padding:15px 0;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td align="center" valign="top"><h1 style="text-align:left;font-family:'Open Sans','Segoe UI','Apple SD Gothic Neo','Lucida Grande','Lucida Sans Unicode',sans-serif;font-weight:Bold;font-size:32px;color:#2A2A2A;padding:2px 0;line-height:38px;"> Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation </h1></td></tr></table></td></tr><tr><td style="height:0px;width:0px;"><div style="height:1px;" data-open-tracking="true"> <img src="https://elink4f7.mail.bycloud.ai/ss/o/u001.3wmUuY8gEWd4_869a_eXcg/4ie/eQkcXh7ZTkONm8QCVXBBnw/ho.gif" alt="" width="1" height="1" border="0" style="height:1px !important;width:1px !important;border-width:0 !important;margin-top:0 !important;margin-bottom:0 !important;margin-right:0 !important;margin-left:0 !important;padding-top:0 !important;padding-bottom:0 !important;padding-right:0 !important;padding-left:0 !important;"/> </div></td></tr></table></div></td></tr><tr id="content-blocks"><td class="email-card-body" align="center" valign="top" style="padding-bottom:28px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td id="nov-18-th-nov-24-th-33-latest-ai-re" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:normal;padding:0px 28px;text-align:left;"><h6 style="color:#2A2A2A;font-weight:normal;mso-line-height-alt:87.5%;"><i>July 14th ~ Jul 20th</i><br><i>#65 Latest AI Research Explained Simply</i></h6></td></tr><tr><td><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" style=""><tr><td bgcolor="#222222" style="background-color:#222222;padding:0.0px 0.0px 0.0px 0.0px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0"><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"></p></td></tr></table></td></tr></table></td></tr><tr><td id="industry-news-in-1-line" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">🗞️ Industry News in 1 Line</h2></td></tr><tr><td 
style="padding-bottom:12px;padding-left:50px;padding-right:40px;padding-top:12px;" class="ee"><div style="margin-left:0px;" class="edm_outlooklist"><ol start="1" style="list-style-type:decimal;margin:0px 0px;padding:0px 0px 0px 0px;"><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"><span style="background-color:#e0e0e0;"><span style="color:rgb(255, 58, 58);font-size:0.6rem;">♥ 1.8k</span></span> <a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.fUNb4GdFo9D3F8WuLArtoUdhNM7gmNK8l9WQAevR6iu4WSjKg8IW0dS2D7PoP2TMCzVi0KEm9dOPKWk0ckXNEkrO5PpNtTcZJdV9NHMIVAWlnJNj9m2OfAHGx086eiHGPsQ3benNKrfw94l3W8NKBA/4ie/eQkcXh7ZTkONm8QCVXBBnw/h1/h001.euAFldGNCk82oi6g6-UryR8-V_bbOm8r4qmenMR9bXk" target="_blank" rel="noopener noreferrer nofollow"><span>The ARC Prize has launched ARC-AGI-3</span></a>, a new benchmark using interactive games designed to test advanced AI agent capabilities that are difficult for today's best models. The API is available openly and a 30-day agent-building competition is being hosted in partnership with Hugging Face. <a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.tLfGW26lAwaS9gFg17HSoH-_-AMhAPpS0II1hbwbuQjr8UjbR_YTeH8uB6IQ3kUesOemVXXxLNPhIWCa27cBn89hZLtxdppvZgXLesAjEwyN8hf0jRQyfdp-MCQd5iE9Kii2ED1p12iPd1_2oOZ7Vw/4ie/eQkcXh7ZTkONm8QCVXBBnw/h2/h001.Mfo8eDM3KgQ99OXb2UcqwE6X021F_FoD91DcyiaQzRA" target="_blank" rel="noopener noreferrer nofollow"><span>Try the ARC-AGI-3 game (benchmark) for free</span></a> in your browser. </p><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:524px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/c4fd4496-e435-4586-ab09-39349d096486/image.png?t=1753195159" alt="" height="auto" width="524" style="display:block;width:100%;" border="0"/></td></tr></table></li><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"><span style="background-color:#e0e0e0;"><span style="color:rgb(255, 58, 58);font-size:0.6rem;">♥ 13k</span></span> OpenAI has started rolling out its new ChatGPT agent to paid subscribers. This new tool is designed to perform complex, multi-step tasks using a suite of capabilities including a <b>browser, terminal, and code execution</b>. Read the <a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.DIqyAo9xTeoWriogq2VlWeUmi9WmFR4pnC4wMSHAHOFr9AtQr8yPwBnvkBLtg4IUpMvqDTHkynzV5FIdgd82ThDCWJSJ72w41QZa7f0n1IZdIMWlLh1Eu7pqLCTlIFIlJkxYa3gyn5POl-epUE0uQ8_Ygid8FpMyi-BK9WBusAo/4ie/eQkcXh7ZTkONm8QCVXBBnw/h3/h001.qi33RtDNytYKqq_CBSM_D29Hm2M5ySToYBum0qCe3UE" target="_blank" rel="noopener noreferrer nofollow"><span>ChatGPT agent System Card</span></a> to learn more. 
</p><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7d2f8465-3951-4db6-9f3b-1af2e0eec2f4/image.png?t=1753195663" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr></table></li><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"><span style="background-color:#e0e0e0;"><span style="color:rgb(255, 58, 58);font-size:0.6rem;">♥ 7.3k</span></span> OpenAI is internally testing a reasoning model which has achieved <a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.1muhFWIqieRYpaJ-FbWSCWbjsxIfppvztLQMQbaeyEQM6bCZ8WBphypCZTUYiysbrJ2h4BzhoqD5wfvkQpG5WIvYEQHttki3iujy6rbDM3FUMvrMfNEX9taOJ5ejSLehwLY7kZr2EAF1D9D_YBQdCQ/4ie/eQkcXh7ZTkONm8QCVXBBnw/h4/h001.IN9tuusz3nD3cYPhIK7f67NXv8bZononeQn1ETlNTj0" target="_blank" rel="noopener noreferrer nofollow"><span><b>gold medal-level performance</b></span></a><b> on the International Math Olympiad</b>. It successfully solved 5 out of 6 problems by generating natural language proofs under human-like exam conditions (without specific IMO prompts or training). OpenAI highlighted that this marks a significant leap in AI's creative reasoning capabilities and clarified that this specific research model is not the upcoming GPT-5 and <b>will not be released for several months</b>. On the other hand, <a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.9ggl6Mt0xphuuMReR5gVpSbxj5I3gsY0nXQWe4O2XZe5ThD0eAC6Vvp5GlomnagifXU7QmMj5J4sstOmU5iSX-6MWJF3-5Qnlx11-zFZ8EPCLsykdUwywIw36Rg--t8VdaTwIWLV8K26LTcaGEyjYSFMhCVuQ70WgL10HZJ4TQK-fEBp0QanmbKyPUwRubyAt8-IzBFEQJlcIsyuYRX3_p1rTJWC26iy5dC0iYk5wacMccx9s5Ly4CDiSxAgt7Pm9t0rCi_VgR_burd1nK0ghV67jH8ITgzGd301lN6SHzk/4ie/eQkcXh7ZTkONm8QCVXBBnw/h5/h001.wc7ancfRimRXRuCA66bpeo8YsS8rpInZprkK_klfOzo" target="_blank" rel="noopener noreferrer nofollow"><span><b>Gemini also achieved Gold medal-level performance</b></span></a>, it was completed in LEAN so it was not just in natural language. However, it was prompted with a lot of IMO specific math examples. </p></li><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"><span style="background-color:#e0e0e0;"><span style="color:rgb(255, 58, 58);font-size:0.6rem;">♥ 3.3k</span></span> Qwen has updated its 235B model and announced that they will be moving away from a "hybrid thinking mode" to training separate Instruct and Thinking models for higher quality. The new <code>Qwen3-235B-A22B-Instruct-2507</code> model is now publicly available and offer improved overall capabilities and better performance on agent tasks. 
Download Qwen3-235B weights from <a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.CxDkkVpJsBdVoe83c_tBWu5HHpecwStOph8cXvproQ8L-Abt6kJleNkHA-VgHF3IsXaNjj4eDZRkPknXvwH2VZ_DYoosHn4d96Phc8fMg4KDe4RF7nHzFNj_YQpAAstf_aoVIuRqqRM78hPatM1LZ8JJJn9-JuY0PcyiZ0bWVRw/4ie/eQkcXh7ZTkONm8QCVXBBnw/h6/h001.lGVY5HHQMwBjS27_MDsApL_7mwisA7Jim_Wmt7NNzQ8" target="_blank" rel="noopener noreferrer nofollow"><span>Hugging Face</span></a> or <a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.c6q0w4g5sodbtO4I1B_pxbBqUjAzI8c4yNhiTrsgEvMUoEGn8XL4B05KzVeeyouQEYgoeVveOS7iFU_cma7kVx4APSKawYeyrW9gwhh2BJuyluCjx4bnvUTkvye7_YqqEOaflKN85swXWOQN9Umhd2mfmNfik-gOcGGN-rS9ctY/4ie/eQkcXh7ZTkONm8QCVXBBnw/h7/h001.5vUajkvOLQBhChi0gx_4cl8OXg8K6aUMMIn8YetW_gI" target="_blank" rel="noopener noreferrer nofollow"><span>ModelScope</span></a> today. </p><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7748cfda-13a5-4696-8f2b-91dd214fedd9/4c0c9ba6491643b3bdc0103c570e880634c97ff57d50cfe124481410548c186f.jpeg?t=1753196384" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr></table></li></ol></div></td></tr><tr><td><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" style=""><tr><td bgcolor="#222222" style="background-color:#222222;padding:0.0px 0.0px 0.0px 0.0px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0"><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"></p></td></tr></table></td></tr></table></td></tr><tr><td><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" style=""><tr><td bgcolor="transparent" style="background-color:transparent;border-color:#2C81E5;border-style:solid;border-width:5px;padding:0.0px 0.0px 0.0px 0.0px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0"><tr><td class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;"><span style=""><a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.DUiN96-Eq7pUHzwEhy5j29rweNtSHTMPdCBNsuu15hStNWn7Cp-_z7KAfyQIe6He4nQT_96lcVXX1db5XGEmA2fwDiqKQU_6H5ubd8zR_R7fdhqA0y5gF6aRHwtvIqQk/4ie/eQkcXh7ZTkONm8QCVXBBnw/h8/h001.9-cjh5CxqRGxKbzIExXCRYfd_mdPJ5R8pZgtIF3iqro" target="_blank" rel="noopener noreferrer nofollow"><span>findmypapers.ai</span></a></span><span style=""> - find THE AI papers semantically </span>👀</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><span style="color:rgb(34, 34, 34);font-family:Georgia, "Times New Roman", serif;font-size:16px;">Indexing over 450,000 papers now, </span><span style=""><a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.DUiN96-Eq7pUHzwEhy5j29rweNtSHTMPdCBNsuu15hStNWn7Cp-_z7KAfyQIe6HeOFOoVeJhHHgfKKiaK6i5-Ic350jZ21CxuRI2MkUHelTFWfhrsXZS1ZYo28G1EYDp/4ie/eQkcXh7ZTkONm8QCVXBBnw/h9/h001.dOJedUgCGuNvTiTsk-A5TismpVNm7NnNVS_3JLP413E" target="_blank" rel="noopener noreferrer nofollow"><span>findmypapers.ai</span></a></span><span style=""> lets you find AI papers semantically, the best research partner you could ever ask for.</span></p></td></tr><tr><td 
align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.DUiN96-Eq7pUHzwEhy5j29rweNtSHTMPdCBNsuu15hStNWn7Cp-_z7KAfyQIe6Heu4XfgVZCIYSF5UjwZDASpyWFcfhFkP2TFOOwHXWr6iDTPPq42Rx9e09rZOxXYfGf/4ie/eQkcXh7ZTkONm8QCVXBBnw/h10/h001.37NBV6A2PZ7H2__XraMJVBfPHFdn_Yg-MydMLFw_6nM" rel="noopener noreferrer nofollow" style="text-decoration:none;" target="_blank"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/bfba51a0-ee8e-42d5-8b45-51ddfd5ebe33/image.png?t=1748369227" alt="" height="auto" width="656" style="display:block;width:100%;border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" border="0"/></a></td></tr></table></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><span style="">We are also developing a new feature </span><span style=""><b>Scout</b></span><span style="">: a semantic arXiv paper alerts that can filter through universities, now in early beta!</span></p></td></tr><tr><td align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/1818f776-3c25-4ecb-a900-1315154378fe/image.png?t=1753205342" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr><tr><td align="center" valign="top" class="t" style="width:656px; padding: 4px 0px 4px 0px;"><p><span style="">POV: creating a new scout </span></p></td></tr></table></td></tr><tr><td align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/cf6c940f-09ef-40bb-953b-47cf1445a005/image.png?t=1753205378" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr><tr><td align="center" valign="top" class="t" style="width:656px; padding: 4px 0px 4px 0px;"><p><span style="">you can also select specific institutions! 
</span></p></td></tr></table></td></tr><tr><td align="center" valign="top" style="padding-bottom:14px;padding-left:28px;padding-right:28px;padding-top:14px;text-align:center;width:100%;word-break:break-word;" class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" align="center" style="margin:14px auto 14px auto;"><tr><td align="center" valign="middle" height="44.75" style="height:44.75px;background-color:#2C81E5;border-color:#DFD150;border-radius:10px 10px 10px 10px;border-style:solid;border-width:0px 0px 0px 0px;color:#FFFFFF;font-family:'Open Sans','Segoe UI','Apple SD Gothic Neo','Lucida Grande','Lucida Sans Unicode',sans-serif;"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.DUiN96-Eq7pUHzwEhy5j29rweNtSHTMPdCBNsuu15hStNWn7Cp-_z7KAfyQIe6He9SUNC4nbJ6LlgkqlU9UDXwVB3vJvizYgRJ7hNxQcy0ODHSr63C3LoSd725TYLfvP/4ie/eQkcXh7ZTkONm8QCVXBBnw/h11/h001.UKCrKnqs1UM90ugYOHJ_p4cjkd-dRMhHc3iAJLJoeI0" target="_blank" rel="noopener noreferrer nofollow" style="color:#FFFFFF;display:block;font-size:16px;font-size:16px;font-weight:normal;padding:0px 14px;padding:14px 14px 14px 14px;text-decoration:none;"> Check Out FMP </a></td></tr></table></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><span style="font-size:0.8rem;font-weight:500;"><b><a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.tLfGW26lAwaS9gFg17HSoGymQ3NNPtd5dE5MV_8UgjIDFPVXngz8pvQBldSW42yhUe_Qiq6DgEPMEBuPL9yfRpXelTiuu2kS8pLFvsoem_XoZoy_n13sTKUhZIbl0VH6/4ie/eQkcXh7ZTkONm8QCVXBBnw/h12/h001.-CxjAzPZV6UUjRYFKzXZ44haE9Oj9Qq83ljxO1qIa84" target="_blank" rel="noopener noreferrer nofollow"><span>Advertise with The AI Timeline! </span></a></b></span></p></td></tr></table></td></tr></table></td></tr><tr><td><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" style=""><tr><td bgcolor="#222222" style="background-color:#222222;padding:0.0px 0.0px 0.0px 0.0px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0"><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"></p></td></tr></table></td></tr></table></td></tr><tr><td id="mixtureof-recursions-learning-dynam" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><i>Bae et al. 
[KAIST AI, Mila, Google Cloud, Google DeepMind, Google Research, Université de Montréal]</i></p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><span style="background-color:#e0e0e0;"><span style="color:rgb(255, 58, 58);font-size:0.6rem;"> ♥ 3.1k </span></span><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> </span><span style="background-color:#e0e0e0;"><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> Transformers </span></span></p></td></tr><tr><td id="introduction-to-mixtureof-recursion" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:normal;padding:0px 28px;text-align:left;"><h3 style="color:#2A2A2A;font-weight:normal;mso-line-height-alt:125.0%;">Introduction to Mixture-of-Recursions</h3></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Large language models deliver impressive capabilities but come with hefty computational and memory costs. Training and deploying them is still expensive, and existing efficiency methods typically address either parameter sharing or adaptive computation, but not both. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> This paper introduces <b>Mixture-of-Recursions (MoR)</b>, a new framework that unifies these approaches in a single Recursive Transformer. By dynamically adjusting how deeply each token "thinks," MoR cuts redundant computation while reusing parameters across layers, and promises large-model quality without the large-model cost. </p></td></tr><tr><td align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/765a67d2-52cc-4d8d-8f8a-7a6a4c5fae12/mor_fig1.png?t=1753194138" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr></table></td></tr><tr><td id="inner-working-of-mixtureof-recursio" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:normal;padding:0px 28px;text-align:left;"><h3 style="color:#2A2A2A;font-weight:normal;mso-line-height-alt:125.0%;">Inner working of Mixture-of-Recursions</h3></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> MoR tackles efficiency through three interconnected mechanisms. First, it reuses a shared stack of layers across multiple recursion steps, slashing parameter counts. For example, a model with nine layers might cycle through three shared blocks repeatedly, reducing unique weights by two-thirds. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Second, lightweight routers assign token-specific recursion depths. In <b>expert-choice routing</b>, tokens compete for limited slots at each recursion step, with top performers advancing deeper. Alternatively, <b>token-choice routing</b> commits each token to a fixed depth upfront, avoiding information leakage but requiring load-balancing techniques. 
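</p><p style="mso-line-height-alt:150.0%;"> To make this routing concrete, below is a minimal PyTorch-style sketch of an expert-choice-routed recursive block. It is our illustration rather than the authors' released code: the module name <code>MoRSketch</code>, the <code>keep_ratio</code> budget, and the single linear router are assumptions, and for clarity it refines the full sequence at every step and merely masks the update, whereas the paper saves compute by restricting attention to the tokens that remain active. </p><pre style="font-family:'Courier New',Courier,monospace;font-size:13px;line-height:19px;background-color:#F6F6F6;border:1px solid #E1E8ED;border-radius:5px;padding:12px;margin:0px;overflow-x:auto;"><code>import torch
import torch.nn as nn

class MoRSketch(nn.Module):
    """Illustrative Mixture-of-Recursions block (hypothetical names and shapes)."""

    def __init__(self, d_model, n_layers, max_recursions, keep_ratio=0.5):
        super().__init__()
        # One shared stack of layers, reused at every recursion step.
        self.shared_block = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
             for _ in range(n_layers)]
        )
        self.router = nn.Linear(d_model, 1)   # lightweight per-token score
        self.max_recursions = max_recursions
        self.keep_ratio = keep_ratio

    def forward(self, h):                      # h: (batch, seq, d_model)
        active = torch.ones(h.shape[:2], dtype=torch.bool, device=h.device)
        for _ in range(self.max_recursions):
            # Expert-choice routing: tokens compete for a limited budget and
            # only the top-scoring ones keep "thinking" at this depth.
            scores = self.router(h).squeeze(-1)               # (batch, seq)
            scores = scores.masked_fill(~active, float("-inf"))
            k = max(1, int(self.keep_ratio * h.shape[1]))
            topk = scores.topk(k, dim=-1).indices
            keep = torch.zeros_like(active)
            keep.scatter_(1, topk, torch.ones_like(topk, dtype=torch.bool))
            active = torch.logical_and(active, keep)
            # Refine with the shared block; tokens that exited keep their state.
            refined = h
            for layer in self.shared_block:
                refined = layer(refined)
            h = torch.where(active.unsqueeze(-1), refined, h)
        return h
</code></pre><p style="mso-line-height-alt:150.0%;"> A token-choice variant would instead use the same router scores once, up front, to commit each token to a fixed recursion depth. 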
</p></td></tr><tr><td align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/ee90f266-8473-402d-9824-cfd720b66c75/image.png?t=1753194198" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr><tr><td align="center" valign="top" class="t" style="width:656px; padding: 4px 0px 4px 0px;"><p>Parameter-sharing strategies in Recursive Transformers.</p></td></tr></table></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> These routing decisions enable the third innovation: optimized key-value (KV) caching. With <b>recursion-wise caching</b>, MoR stores only the KV pairs of active tokens at each step, trimming memory traffic. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> A variant called <b>recursive KV sharing</b> takes this further, it caches all tokens’ KV pairs at the first recursion and reuses them in later steps, accelerating prefill at the cost of attention granularity. Together, these strategies ensure compute is focused where needed, reducing FLOPs and memory bottlenecks. </p></td></tr><tr><td id="evaluation-and-results-of-mixtureof" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:normal;padding:0px 28px;text-align:left;"><h3 style="color:#2A2A2A;font-weight:normal;mso-line-height-alt:125.0%;">Evaluation and results of Mixture-of-Recursions</h3></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> MoR performs well in benchmarks across model sizes (135M to 1.7B parameters). At equal training compute, MoR with two recursions outperformed vanilla transformers in validation perplexity (2.75 vs. 2.78) and few-shot accuracy (43.1% vs. 42.3%), despite using 50% fewer parameters. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> When trained on fixed data, MoR matched baseline accuracy with <b>25% fewer FLOPs</b>, speeding up training by 19% and cutting peak memory by 25%. Inference throughput surged too, continuous depth-wise batching and early exiting delivered up to 2.06× speedups, with MoR-4 variants leading gains. 
</p></td></tr><tr><td align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/21dafb58-4b0c-4462-97cf-1f277d43d847/image.png?t=1753194243" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr><tr><td align="center" valign="top" class="t" style="width:656px; padding: 4px 0px 4px 0px;"><p>Comparison of MoR, Recursive, and Vanilla Transformers</p></td></tr></table></td></tr><tr class="embed-gen-img-r"><td align="center" valign="top" style="padding:12px 12px 12px 12px;" class="dd"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td align="center" valign="top" class="o" style="padding:12px 12px 12px 12px;;background-color:#FFFFFF;border-color:#F1F1F1;border-radius:5px 5px 5px 5px;border-width:1px 1px 1px 1px;"><!--[if !mso]><!--><div style="display:none; float:left; overflow:hidden; width:0; max-height:0; line-height:0;" class="mob-show"><table role="none" border="0" cellspacing="0" cellpadding="0" align="right" width="100%"><tr><td align="center" valign="top"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.VomAAYwkCjux8i_FMc4kJQ9SKUUzydN5BZEiG_zjI54HlDSl3ozuRCi-nG7i9yVyMfy96GKm6RzcDNznPVTUlvu3q4AIofg2L6wBI-n5PeuNBeAhrTrXyNkvAE33T6oZz12ZrDcgpm7-7j2tufyA9Q/4ie/eQkcXh7ZTkONm8QCVXBBnw/h13/h001.9coERREIYXnXKnt6EbQkJLHA0VxGKQWRD5vDnhJhfvU" target="_blank"><img src="https://opengraph.githubassets.com/d1853f012e18704de9d4ef6a9a641b5687a35086051d09027be42519e408b505/raymin0223/mixture_of_recursions" width="100%" style="height:auto;display:block;"/></a></td></tr><tr><td height="16" style="font-size:16px;line-height:16px;"> </td></tr></table></div><!--<![endif]--><table role="none" border="0" cellspacing="0" cellpadding="0" align="right" width="100%"><tr><td width="57%" align="center" valign="middle" class="mob-stack"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td align="left" valign="middle" class="l"><p><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.VomAAYwkCjux8i_FMc4kJQ9SKUUzydN5BZEiG_zjI54HlDSl3ozuRCi-nG7i9yVyMfy96GKm6RzcDNznPVTUlqOrtT8RxxlNSrFmpmI3w4Zf54t4pbUy3pYj_FdK2JFUJokpHPsIxA230Qf07lcCMQ/4ie/eQkcXh7ZTkONm8QCVXBBnw/h14/h001.LIxKrGQaAlONDDOEzzUk0UFZ2p4NYJw1QqLzMaub998" style="text-decoration:none;font-style:normal;color:#2D2D2D !important;font-size:14px;line-height:20px;" target="_blank"> GitHub - raymin0223/mixture_of_recursions: Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation <tr><td align="left" valign="top" class="m"><p style="font-size:13px;line-height:19px;color:#2D2D2D;"> Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation - raymin0223/mixture_of_recursions </p></td></tr><tr><td align="left" valign="bottom" class="n" style="vertical-align:bottom;padding-top:12px;"><p style="word-break:break-word;">github.com/raymin0223/mixture_of_recursions</p></td></tr></a></p></td></tr></table></td><td width="3%" style="font-size:16px;line-height:16px;" class="mob-hide"> </td><td width="40%" align="left" valign="top" class="mob-hide"><a 
href="https://elink4f7.mail.bycloud.ai/ss/c/u001.VomAAYwkCjux8i_FMc4kJQ9SKUUzydN5BZEiG_zjI54HlDSl3ozuRCi-nG7i9yVyMfy96GKm6RzcDNznPVTUlp-iKptW8rfDgeJnpvmZmgX9Rxcwe1cotyIeOMv3WzrDJpHjy8lb-WwwbRAR4WH4rw/4ie/eQkcXh7ZTkONm8QCVXBBnw/h15/h001.uW2H_xD_Yer7D_FuBA2OEvuhvdeT4RIlJZAKL0sTxKY" target="_blank"><img src="https://opengraph.githubassets.com/d1853f012e18704de9d4ef6a9a641b5687a35086051d09027be42519e408b505/raymin0223/mixture_of_recursions" width="242" style="height:auto;display:block;"/></a></td></tr></table></td></tr></table></td></tr><tr><td align="center" valign="top" style="padding-bottom:14px;padding-left:28px;padding-right:28px;padding-top:14px;text-align:center;width:100%;word-break:break-word;" class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" align="center" style="margin:14px auto 14px auto;"><tr><td align="center" valign="middle" height="44.75" style="height:44.75px;background-color:#2C81E5;border-color:#DFD150;border-radius:10px 10px 10px 10px;border-style:solid;border-width:0px 0px 0px 0px;color:#FFFFFF;font-family:'Open Sans','Segoe UI','Apple SD Gothic Neo','Lucida Grande','Lucida Sans Unicode',sans-serif;"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.fUNb4GdFo9D3F8WuLArtoV5sElgytBlvJRzI9WtI92Z56At2SSrqIkujaKw7hiIDQrIlEsaY1SKKiOIi1ffbe3wn-ymJte0zGfgeQQ1K13g3Z-F28LS2HZnQxzgmyFz-/4ie/eQkcXh7ZTkONm8QCVXBBnw/h16/h001.7BikgYnB0gMhNf9p16iH3jJDmu_9r0iHIsuq3EHtvXs" target="_blank" rel="noopener noreferrer nofollow" style="color:#FFFFFF;display:block;font-size:16px;font-size:16px;font-weight:normal;padding:0px 14px;padding:14px 14px 14px 14px;text-decoration:none;"> Read Full Paper </a></td></tr></table></td></tr><tr><td><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" style=""><tr><td bgcolor="#222222" style="background-color:#222222;padding:0.0px 0.0px 0.0px 0.0px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0"><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"></p></td></tr></table></td></tr></table></td></tr><tr><td id="how-many-instructions-can-ll-ms-fol" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">How Many Instructions Can LLMs Follow at Once?</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><i>Jaroslawicz et al. 
[Distyl AI]</i></p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><span style="background-color:#e0e0e0;"><span style="color:rgb(255, 58, 58);font-size:0.6rem;"> ♥ 826 </span></span><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> </span><span style="background-color:#e0e0e0;"><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> LLM instruction following </span></span><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> </span></p></td></tr><tr><td id="introduction-to-if-scale-benchmarki" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">Introduction to IFScale: Benchmarking LLMs Under High Instruction Load</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Real-world LLM applications often require models to follow dozens or even hundreds of instructions simultaneously, whether generating reports with strict formatting rules, adhering to compliance standards, or executing complex workflows. But how do today's models perform when pushed to their cognitive limits? </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Existing benchmarks only test models with single or few instructions, which leaves a critical gap in understanding performance under high instruction density. The IFScale benchmark tackles this by measuring how LLMs handle increasing instruction loads in a controlled business report writing task. </p></td></tr><tr><td id="how-if-scale-tests-instruction-foll" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">How IFScale Tests Instruction-Following Capabilities</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> The benchmark uses a straightforward approach: models must generate a professional business report while including up to 500 specific keywords, with each keyword acting as a distinct instruction. The vocabulary comes from real SEC filings, processed through multiple filtering steps to ensure relevance and uniqueness. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> For each test run, the model receives a prompt listing N instructions like "Include the exact word {keyword}" alongside the report-writing task. Performance is automatically evaluated by checking keyword inclusion through pattern matching. 
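</p><p style="mso-line-height-alt:150.0%;"> A rough sketch of that loop (our own illustration; the exact prompt wording and matching rules are the paper's) builds a prompt with one "Include the exact word ..." instruction per keyword and scores the generated report by whole-word regex matching: </p><pre style="font-family:'Courier New',Courier,monospace;font-size:13px;line-height:19px;background-color:#F6F6F6;border:1px solid #E1E8ED;border-radius:5px;padding:12px;margin:0px;overflow-x:auto;"><code>import re

def build_prompt(keywords):
    """Assemble a report-writing prompt with one instruction per keyword."""
    instructions = "\n".join(
        f'{i + 1}. Include the exact word "{kw}".' for i, kw in enumerate(keywords)
    )
    return ("Write a professional business report.\n"
            "Follow every instruction below:\n" + instructions)

def score_report(report, keywords):
    """Fraction of keywords present as whole words (case-insensitive)."""
    hits = sum(
        bool(re.search(rf"\b{re.escape(kw)}\b", report, re.IGNORECASE))
        for kw in keywords
    )
    return hits / len(keywords)

keywords = ["strategy", "revenue", "compliance"]
prompt = build_prompt(keywords)
# report = call_your_model(prompt)   # model call omitted in this sketch
report = "Our strategy improves revenue while meeting compliance targets."
print(score_report(report, keywords))   # 1.0
</code></pre><p style="mso-line-height-alt:150.0%;"> Sweeping the length of <code>keywords</code> from 10 up to 500 then traces out the density schedule the benchmark uses. 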
</p></td></tr><tr><td align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/86594db5-68ff-4995-8bd7-e251912cfa41/image.png?t=1753194536" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr><tr><td align="center" valign="top" class="t" style="width:656px; padding: 4px 0px 4px 0px;"><p>Model performance degradation as instruction density increases from 10 to 500 instructions.</p></td></tr></table></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> The design cleverly isolates instruction-following capability from other factors. Keywords are single English words to minimize linguistic complexity, and the business report context mirrors real-world scenarios. The benchmark scales instruction density from 10 to 500 in increments of 10, revealing how performance degrades as cognitive load increases. Error types are categorized as omissions (missing keywords) or modifications (using variants like "strategic" instead of "strategy"). </p></td></tr><tr><td id="key-findings-and-practical-implicat" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">Key Findings and Practical Implications</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Even top-tier LLMs like Gemini 2.5 Pro achieve only 68% accuracy at 500 instructions. After analysis, the researchers noted three distinct degradation patterns: </p></td></tr><tr><td style="padding-bottom:12px;padding-left:50px;padding-right:40px;padding-top:12px;" class="ee"><div style="margin-left:0px;" class="edm_outlooklist"><ul style="font-weight:normal;list-style-type:disc;margin-bottom:12px !important;margin-top:12px !important;padding:0px 0px 0px 0px;"><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"><b>Threshold decay</b>: Reasoning models maintain near-perfect performance until a critical point (around 150 instructions) before declining </p></li><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"><b>Linear decay</b>: Steady performance drop across all densities </p></li><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"><b>Exponential decay</b>: Rapid collapse after minimal instruction loads </p></li></ul></div></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Notably, models universally show a "primacy effect", favoring earlier instructions, that peaks around 150-200 instructions before fading. At extreme densities, models abandon selective attention and fail uniformly. Error analysis shows omissions become 30x more common than modifications in weaker models. 
</p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Additionally, reasoning models like o3 take 10x longer to process high-density prompts compared to faster models like Claude 3.5 Haiku. </p></td></tr><tr><td align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/232de5d6-ab93-4323-85e7-1b1e9cf50f0a/image.png?t=1753194637" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr></table></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Based on these results, we can conclude that: </p></td></tr><tr><td style="padding-bottom:12px;padding-left:50px;padding-right:40px;padding-top:12px;" class="ee"><div style="margin-left:0px;" class="edm_outlooklist"><ol start="1" style="list-style-type:decimal;margin:0px 0px;padding:0px 0px 0px 0px;"><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"> Prioritize reasoning-capable models for high-instruction tasks </p></li><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"> Place critical instructions early in prompts </p></li><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"> Balance accuracy needs against latency constraints </p></li><li class="listItem ultext"><p style="mso-line-height-alt:150.0%;padding:0px;text-align:left;word-break:break-word;"> Avoid exceeding 150 instructions for time-sensitive applications </p></li></ol></div></td></tr><tr class="embed-gen-text"><td align="center" valign="top" style="padding:12px 12px 12px 12px;" class="dd"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td align="center" valign="top" class="o" style="padding:12px 12px 12px 12px;;background-color:#FFFFFF;border-color:#F1F1F1;border-radius:5px 5px 5px 5px;border-width:1px 1px 1px 1px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td align="left" valign="top" class="l"><p><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.9ggl6Mt0xphuuMReR5gVpWQFxRTrGSSZnQdp915xHRxWu9ato3DWRnnkdzkV3-dqrLbq6ekEI1F7nlWVh8zTkGMb8QVEA-zK-ADWqHWwgP1gL_2qulhAa_luRZqmDBIR/4ie/eQkcXh7ZTkONm8QCVXBBnw/h17/h001.Ji9lFzBYspgfMAQr4jgwIDc1HJ3q-ZqnY2rGothGh7o" style="text-decoration:none;font-style:normal;color:#2D2D2D !important;font-size:14px;line-height:20px;" target="_blank"> View the full benchmark and results of IFScale (Instruction Following at Scale) <tr><td align="left" valign="bottom" class="n" style="vertical-align:bottom;padding-top:12px;"><p style="word-break:break-word;">distylai.github.io/IFScale</p></td></tr></a></p></td></tr></table></td></tr></table></td></tr><tr><td align="center" valign="top" style="padding-bottom:14px;padding-left:28px;padding-right:28px;padding-top:14px;text-align:center;width:100%;word-break:break-word;" class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" align="center" style="margin:14px auto 
14px auto;"><tr><td align="center" valign="middle" height="44.75" style="height:44.75px;background-color:#2C81E5;border-color:#DFD150;border-radius:10px 10px 10px 10px;border-style:solid;border-width:0px 0px 0px 0px;color:#FFFFFF;font-family:'Open Sans','Segoe UI','Apple SD Gothic Neo','Lucida Grande','Lucida Sans Unicode',sans-serif;"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.fUNb4GdFo9D3F8WuLArtoV5sElgytBlvJRzI9WtI92aA0l8_Xy1-0Lm8spAPdy9gq0Bi2tw3yLeG47YskVMMsC8vdjYXEScSgx4IoiJJSdO7v_O9BgTnbDln9LYVAjRc/4ie/eQkcXh7ZTkONm8QCVXBBnw/h18/h001.RVZ38NTl5wqOTo1iHw3XrSDq1FVwZfwhfgzOD4Dr6io" target="_blank" rel="noopener noreferrer nofollow" style="color:#FFFFFF;display:block;font-size:16px;font-size:16px;font-weight:normal;padding:0px 14px;padding:14px 14px 14px 14px;text-decoration:none;"> Read Full Paper </a></td></tr></table></td></tr><tr><td><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" style=""><tr><td bgcolor="#222222" style="background-color:#222222;padding:0.0px 0.0px 0.0px 0.0px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0"><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"></p></td></tr></table></td></tr></table></td></tr><tr><td id="chain-of-thought-monitorability-a-n" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><i>Korbak et al. [UK AI Security Institute, Apollo Research]</i></p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"><span style="background-color:#e0e0e0;"><span style="color:rgb(255, 58, 58);font-size:0.6rem;"> ♥ 424 </span></span><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> </span><span style="background-color:#e0e0e0;"><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> LLM Training </span></span><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> </span><span style="background-color:#e0e0e0;"><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> bycloud’s pick </span></span><span style="color:rgb(44, 129, 229);font-size:0.6rem;"> </span></p></td></tr><tr><td id="introduction-to-chain-of-thought-mo" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">Introduction to Chain of Thought Monitoring</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Advanced AI systems often operate like black boxes, making it challenging to detect harmful intentions before they result in dangerous actions. But there's a unique opportunity emerging with models that "think" in human language: by monitoring their chains of thought (CoT), we might catch misbehavior early. 
</p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> This approach isn't perfect, some harmful reasoning could still go undetected, but it offers a promising safety layer alongside existing methods. The key insight here is that when AI systems perform complex tasks, they often need to externalize their reasoning through natural language. This creates a window where we can observe their intentions, though this window might close if we're not careful about how we develop future models. </p></td></tr><tr><td align="center" valign="top" style="padding-bottom:20px;padding-left:28px;padding-right:28px;padding-top:20px; " class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" style="margin:0 auto 0 auto;"><tr><td align="center" valign="top" style="width:656px;"><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/8faa05ed-b596-46a9-8834-cfdf9a05b3d3/image.png?t=1753194753" alt="" height="auto" width="656" style="display:block;width:100%;" border="0"/></td></tr><tr><td align="center" valign="top" class="t" style="width:656px; padding: 4px 0px 4px 0px;"><p>Long serial chains of cognition must pass through the chain of thought.</p></td></tr></table></td></tr><tr><td id="inner-workings-of-chain-of-thought-" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">Inner Workings of Chain of Thought Monitoring</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> For reasoning tasks requiring multiple cognitive steps, models must use their chain of thought as working memory. In transformer architectures, information flows through layers in a way that forces complex reasoning to surface in the CoT. This happens because serial reasoning steps, like solving a math problem or planning a sequence of actions, need to pass intermediate results through the generated text tokens. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Essentially, the CoT acts as a bottleneck for intricate thought processes. Models trained this way tend to follow a "natural language prior," meaning their CoT resembles human reasoning patterns rather than opaque machine code. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> However, this monitorability faces challenges. During reinforcement learning phases, models optimize for correct outcomes, not transparent reasoning. If we add direct supervision to make CoT "look safer" or more concise, it might distort how faithfully the CoT reflects the model's true decision process. Even subtle pressures, like reward models favoring certain outputs, could indirectly encourage models to hide concerning reasoning steps. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Architectural changes pose another risk. New designs using latent reasoning, where computation happens in continuous vectors instead of text tokens, could eliminate the need for externalized CoT entirely. 
Without natural language traces, we'd lose this monitoring avenue. That’s why understanding what keeps CoT legible and necessary is critical as models evolve. </p></td></tr><tr><td id="evaluation-and-future-of-chain-of-t" class="dd" align="left" valign="top" style="color:#2A2A2A;font-weight:Bold;padding:0px 28px;text-align:left;"><h2 style="color:#2A2A2A;font-weight:Bold;mso-line-height-alt:150.0%;">Evaluation and Future of Chain of Thought Monitoring</h2></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Current evaluations show CoT monitoring can detect explicit misbehavior, like models writing "Let's hack this system" during planning phases. It also helps uncover hidden goals; for instance, a model might claim helpful intentions in its final output while revealing contradictory motives in its CoT. However, effectiveness varies. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> In simple tasks, models might skip CoT or hide reasoning, which reduces transparency. For complex tasks like cyberattack planning, where CoT is necessary, detection rates improve significantly. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> Future research should focus on CoT readability (can humans or other models understand it?), latent reasoning capabilities (how well can models think without CoT?), and causal relevance (does altering the CoT change outputs?). Developers must track these metrics during training and publish results in system cards. If monitorability degrades, say, due to architectural shifts, teams might revert to earlier checkpoints or balance safety tradeoffs differently. 
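</p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> As a rough illustration of the causal-relevance metric, here is a small sketch of a probe that perturbs the chain of thought and measures how often the final answer changes. The generate_with_cot and answer_given_cot callables are assumed placeholder interfaces you would back with your own model calls; none of this is code from the paper. </p></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><pre style="font-family:monospace;font-size:13px;line-height:19px;background-color:#f6f6f6;border:1px solid #e1e1e1;border-radius:5px;padding:12px;overflow-x:auto;white-space:pre;">
import random

def scramble(cot_steps):
    """Perturb a chain of thought by shuffling its steps."""
    steps = list(cot_steps)
    random.shuffle(steps)
    return steps

def causal_relevance(generate_with_cot, answer_given_cot, prompt, n=20):
    """Fraction of samples where perturbing the CoT changes the final answer.

    generate_with_cot(prompt) returns (cot_steps, answer);
    answer_given_cot(prompt, cot_steps) returns an answer.
    Both are assumed interfaces, not a real library API.
    """
    changed = 0
    for _ in range(n):
        cot, answer = generate_with_cot(prompt)
        if answer_given_cot(prompt, scramble(cot)) != answer:
            changed += 1
    return changed / n  # higher means the CoT is more causally load-bearing
</pre></td></tr><tr><td class="dd" align="left" style="padding:0px 28px;text-align:left;word-break:break-word;"><p style="mso-line-height-alt:150.0%;"> A score near zero would suggest the written-out reasoning is closer to post-hoc narration than to the computation that produced the answer, which is exactly the kind of degradation the authors want developers to track and report.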
</p></td></tr><tr><td align="center" valign="top" style="padding-bottom:14px;padding-left:28px;padding-right:28px;padding-top:14px;text-align:center;width:100%;word-break:break-word;" class="dd"><table role="none" border="0" cellspacing="0" cellpadding="0" align="center" style="margin:14px auto 14px auto;"><tr><td align="center" valign="middle" height="44.75" style="height:44.75px;background-color:#2C81E5;border-color:#DFD150;border-radius:10px 10px 10px 10px;border-style:solid;border-width:0px 0px 0px 0px;color:#FFFFFF;font-family:'Open Sans','Segoe UI','Apple SD Gothic Neo','Lucida Grande','Lucida Sans Unicode',sans-serif;"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.fUNb4GdFo9D3F8WuLArtoV5sElgytBlvJRzI9WtI92bWw-7aIzZRtkLVaP6uqODQ0M3stW6TomBWRCZmE44-v5miPEdJQC0wT-ua1vtxRx-XpSL_dFo_0AWgQOnMwYQn/4ie/eQkcXh7ZTkONm8QCVXBBnw/h19/h001.ovaqmWysyfvn3_RvrdJN8uuGGO8WbGTz7EFYrlNE0-s" target="_blank" rel="noopener noreferrer nofollow" style="color:#FFFFFF;display:block;font-size:16px;font-size:16px;font-weight:normal;padding:0px 14px;padding:14px 14px 14px 14px;text-decoration:none;"> Read Full Paper </a></td></tr></table></td></tr><tr><td class="dd" style="padding: 20px;"><table width="100%" cellpadding="0" cellspacing="0" role="none" style="max-width:520px;margin:0 auto;"><tr><td class="q" style="padding:16px 16px 6px 16px;"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.tLfGW26lAwaS9gFg17HSoDDFT6eh5Nsg0xYVQj-h6I3o9m2k79_qw4izMYhmcI36u5CBXyW1VQWoVRkErC-OShBQ_cmmWVeT9_Qd-93zEwC0HMwyRHrm6h21ApLsMigr2wbEwUCbXm8uCimvPZxECghesVZ3XRHdItwyBvh5Jyo/4ie/eQkcXh7ZTkONm8QCVXBBnw/h20/h001.cdwAL3v_oSxfu4mXxM_QtTgxAthlR9VWw7dXeimM42I" style="text-decoration:none !important;"><table width="100%" cellpadding="0" cellspacing="0" border="0" role="none"><tr><td width="100%" style="padding: 0 0 14px 0;text-decoration:none;width:100%;"><table width="100%" cellpadding="0" cellspacing="0" border="0" role="none"><tr><td width="36" style="width:36px;"><img src="https://pbs.twimg.com/profile_images/1698572487909400576/BvncwnrP_normal.jpg" alt="tw profile: The AI Timeline" style="display:block;width:36px;height:36px;border-radius:50%;border:0;"/></td><td width="400" style="padding:0 0 0 8px;text-decoration:none;"><span style="display:block;font-size:14px;color:#1c2022;font-weight:700;"> The AI Timeline </span><span style="display:block;color:#697882;font-size:14px;"> @TheAITimeline </span></td><td width="24" align="right" style="vertical-align:text-top;"><img width="24" height="24" loading="lazy" alt="tw" style="border:0;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/static_assets/x_logo.png"/></td></tr></table></td></tr><tr></tr><tr><td style="word-break:break-word;"><p>🚨This week's top AI/ML research papers:</p><p>- Mixture-of-Recursions <br>- Scaling Laws for Optimal Data Mixtures <br>- Training Transformers with Enforced Lipschitz Constants <br>- Reasoning or Memorization? <br>- How Many Instructions Can LLMs Follow at Once? 
<br>- Chain of Thought Monitorability <br>-</p></td></tr><tr><td style="padding:12px 0 0 0;"></td></tr><tr><td align="center" style="padding:8px 0 0 0;width:480px;"><img src="https://pbs.twimg.com/media/GwU4aviWYAAtm0G.png" width="480" height="auto" style="display:block;border:1px solid #E1E8ED;border-radius:5px;width:100%;max-width:480px;height:auto;"/></td></tr><tr><td height="8" style="line-height:1px;font-size:1px;height:8px;"> </td></tr><tr><td align="left" valign="top" class="s"><p>8:06 PM • Jul 20, 2025</p></td></tr><tr><td height="10" style="line-height: 1px; font-size: 1px; height: 10px;"> </td></tr><tr><td height="1" bgcolor="#e1e8ed" style="line-height:0px;font-size:0px;height:1px;"></td></tr><tr><td height="10" style="line-height:1px;font-size:1px;height:10px;"> </td></tr><tr><td align="left" valign="top" class="s"><p><b style="color:#1C2022">476</b> Likes <b style="color:#1C2022">70</b> Retweets </p></td></tr><tr><td align="left" valign="top" class="s"><div align="center" style="text-align:center;margin-top:4px;margin-bottom:4px;padding:8px;border:1px solid #ccd6dd;border-radius:9999px;color:#1B95E0"><b>4 Replies</b></div></td></tr></table></a></td></tr></table></td></tr></table></td></tr></table></td></tr><tr><td align="center" valign="top"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td><tr><td class="b" align="center" valign="top" bgcolor="#2a2a2a" style="padding:0px 0px 0px 0px;border-style:solid;border-width: 0px 0px 0px 0px;border-color: #2a2a2a;border-bottom-left-radius:10px;border-bottom-right-radius:10px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td align="center" valign="top" bgcolor="#73ddff" style="padding:12px"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td><span style="padding-left:1px;"></span></td><td align="center" valign="middle" width="75" style="width:75px;"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.1muhFWIqieRYpaJ-FbWSCQqcWoV4NNHHr5SkP9THApWuHAAlWLQxI3Q_IqFmt_DcyAxeC8jDApCnHmMSBGpBb5sgtimvBYgxRX-Rp7s0F3LjCHoSwdhr83OBqRFhJ1y_/4ie/eQkcXh7ZTkONm8QCVXBBnw/h21/h001.c78Qo6Mr7DhbhjRTzVMHR8TqjCaEw5PoNZdfQ0l8440" style="text-decoration:none;"><img width="22" height="22" alt="tw" border="0" style="display:block;max-width:22px;color:Dark" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/static_assets/x_dark.png"/></a></td><td align="center" valign="middle" width="75" style="width:75px;"><a href="https://elink4f7.mail.bycloud.ai/ss/c/u001.amatuKKICSickUKplYJXmBoQnQ9VXnB2zTxBG4HeHBgjMqVxpoXRdj01cjwyoVlHgiebEOgBvwHtevoVpsSvpn3Q1di2ml6sb3cBM-X6IStQbj_zQSVGWJ8AAmPw2en2/4ie/eQkcXh7ZTkONm8QCVXBBnw/h22/h001.UpfLolhTAsfGjVvX9c6dRw5KNbRi4Nva7fGQaLC987o" style="text-decoration:none;"><img width="22" height="16" alt="yt" border="0" style="display:block;max-width:22px;color:Dark" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/static_assets/youtube_dark.png"/></a></td><td><span style="padding-left:1px;"></span></td></tr></table></td></tr><tr><td height="10" style="line-height:1px;font-size:1px;height:10px;"> </td></tr><tr><td class="w" align="center" valign="top" style="padding:15px 15px 15px 15px;"><table role="none" width="100%" border="0" cellspacing="0" cellpadding="0" align="center"><tr><td align="center" valign="top"><p style="font-family:'Verdana',Geneva,sans-serif;color:#FFFFFF!important;"> Update your email 
preferences or unsubscribe <a class="link" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.c6q0w4g5sodbtO4I1B_pxWc4htTObwdorovK0nFHVH-4pUdVE0ELYH5DsNemk732SjNwhPNJ25r0O8B5vYifsBhEpz-DJgyVFmavJPa0OyKRRnvw4o7XGyvIv7PRofnmwLNZECj-fmoYg0YiTvkIYu61L7zum3Weu0Fepc2AXUyEnmYV9c_mQ_tUH25hQXcfXl6M9YN0RcYSeB9ncPK0Tz68WmOXg072ZcULNdacmWtEJs2Ok6HUdwor0gEud2s3VtiXShpHgwL60DbFuPwFIv7tN2qaVPhWf1mbcRrd4v_HVAv6FVw1KHQZERgcn8uVLfz8WhU60-2qO-OfgmFBM-hRHZKsXswUOrbBphbmFFT_axjQqrHtYSJLRhXhhplyK4_ju4ACTw2uQIwU3HwPI8p-7RVksX0pvtHO7ZkXIzjbcCeWTyDsjTFGqLbgkJTiqTzLuXeecKLVbQXmE3YfbukfcmrRr_i4yF6dIIEkADRInS89w8T5YT94sbr6P9MrA-BGi5oBD-I4gWvqyfWNJNb8k5ECfnBIcsfV1CDaXyPez2VkmXwfvTtS6pxDgepiy81nZhKLcGjH4NCeAjrNBMUxK1kXMwnLKTTV8zc5ARJ8Gcx2ylkMLftDxdXEs1gkzv1LEzoG_-vTohQvO14XO8pcbNlqWnDuC9A4kq5iu9Xo7J4Y9O1UJoZmMI0xLZ6WgL_a0NyG7JI1DTJmX8UWMIeVL1AhEnOX4ha0OONOg9NAmM656P-unrPGK9C5TD9LKPUGoOv6rYLo0_sZRylKScqYKrnkr6mVt_apqKRFXRUuxoOLhmdbmbImYQJjMZjtknRPe4Uukl8ZfJHesylZhw/4ie/eQkcXh7ZTkONm8QCVXBBnw/h23/h001.2ANNoke_NF3fdFjEc4SZb5ayVSFdwLm0jawu4Y52Y1o" style="text-decoration:underline;text-decoration-color:#FFFFFF!important;color:#FFFFFF!important;"> here</a></p><p class="copyright" style="font-family:'Verdana',Geneva,sans-serif;color:#FFFFFF!important;"> © 2025 bycloudai </p><p style="font-family:'Verdana',Geneva,sans-serif;color:#FFFFFF!important;"> 228 Park Ave S, #29976, New York, New York 10003, United States </p></td></tr><tr style="display: table-row !important;"><td align="center" valign="top" style="padding-top:20px;" style="display:table-cell !important;"><table role="none" border="0" cellspacing="0" cellpadding="0" align="center" style="display:table !important;"><tr style="display:table-row !important;"><td class="u" align="center" valign="middle" height="32" style="height:32px;display:table-cell !important; max-height: 32px !important;margin:0px !important; background-color: #ffffff !important;"><a style="line-height:32px !important;text-decoration:none;display:block !important;" href="https://elink4f7.mail.bycloud.ai/ss/c/u001.DUiN96-Eq7pUHzwEhy5j28olDWFpV5DDKfdk_OdOKOiKWNKy3Y4QfAaI_-TupNwZz-FV3RZ7iKTnYz1i_mpBpamFBoEysR6JIPYjQqrKkZvWDhzYkGHPI_qK4Kvomt0VCAp-BDDupBv3RlGafvd65bfXvqOurz93Pi3VqGFSZcRiDGgIbhM38Rd6xacaCjoT9hACsJyawR3qFqLzHX0sNB4AJhgC05L3ZGiRKg_9bjbMo_Vq8b9WEdFdA80nShOp/4ie/eQkcXh7ZTkONm8QCVXBBnw/h24/h001.SjAULnFlfsY8uDr_R6wuhs-TJtZ1t21EDzaFA7dUY0o"><img src="https://media.beehiiv.com/output-onlinepngtools.png" width="16" alt="beehiiv logo" style="display:inline-block !important;max-width:16px !important; vertical-align:-3px !important;width: 16px !important;" border="0"/><span style="padding-left:11px !important;display: inline-block !important;">Powered by beehiiv</span></a></td></tr></table></td></tr><tr><td align="left" valign="top" height="2" style="height:2px;"><a href='https://elink4f7.mail.bycloud.ai/ss/c/u001.CxDkkVpJsBdVoe83c_tBWsHIaP4XNp0WgUYqLvHcKk_3uqk_KIkz4ddLinhFbud6JuxLFdSUhYnR7b1NSsmbtzXNGNblnEEMKUtkCAjkn8Y/4ie/eQkcXh7ZTkONm8QCVXBBnw/h25/h001.jt9B0N_GqQxCNkmqx3pk6RZAliAmdpJ-vxcpQAP24Gg' style="color: #2a2a2a !important; cursor: default; font-size: 1px; text-decoration: none;"> Terms of Service </a></td></tr></table></td></tr></table></td></tr></td></tr></table></td></tr></table></td></tr></table></td></tr></table></div></body></html>