Pull requests: Dao-AILab/flash-attention

Author
Filter by author
Loading
Label
Filter by label
Loading
Use alt + click/return to exclude labels
or + click/return for logical OR
Projects
Filter by project
Loading
Milestones
Filter by milestone
Loading
Reviews
Assignee
Filter by who’s assigned
Assigned to nobody Loading
Sort

Pull requests list

[Cute] Fix: arg passing in the cute flash-attn interface
#2101 opened Dec 27, 2025 by SeanLi-OI
Fix incorrect row_max in softmax
#2083 opened Dec 17, 2025 by imbr92
Fix TypeError when ColumnParallelLinear is None
#2080 opened Dec 17, 2025 by ailuntz
Reduce Chance of Build OOM
#2079 opened Dec 17, 2025 by Qubitium
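
Building flash-attn compiles many CUDA translation units in parallel, and each nvcc job can take several gigabytes of RAM, so capping build parallelism is the usual way to avoid the OOM that #2079 targets. A minimal sketch, assuming the MAX_JOBS environment variable that torch.utils.cpp_extension reads during the build (the knob the flash-attention README itself points to); the value 4 is an arbitrary example:

```python
import os
import subprocess

# MAX_JOBS caps the number of parallel compile jobs spawned by
# torch.utils.cpp_extension, bounding peak memory during the build.
env = dict(os.environ, MAX_JOBS="4")
subprocess.run(
    ["pip", "install", "flash-attn", "--no-build-isolation"],
    env=env,
    check=True,
)
```
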
Add missing code highlighting to the README
#2061 opened Dec 10, 2025 by bryant1410
Update README.md
#2058 opened Dec 10, 2025 by eduardoruiz1999
[AMD ROCm] Enable CK backend for ROCm gfx12
#2054 opened Dec 8, 2025 by hyoon1
Ko3n1g/ci/torch2.9 for cuda129
#2044 opened Dec 3, 2025 by ko3n1g Draft
Disable abi3 for free-threaded python
#2034 opened Nov 25, 2025 by kevmo314
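
abi3 wheels target CPython's stable ABI, which free-threaded (no-GIL) builds do not support, hence #2034's need to opt out there. A minimal sketch of the detection, assuming a setuptools build; the extension name and source file are hypothetical:

```python
import sysconfig
from setuptools import Extension, setup

# Py_GIL_DISABLED is truthy on free-threaded CPython builds, which
# cannot use the limited API / abi3 wheels.
free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

setup(
    ext_modules=[
        Extension(
            "flash_attn._stub",     # hypothetical extension name
            sources=["stub.c"],     # hypothetical source file
            py_limited_api=not free_threaded,
        )
    ],
)
```
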
[Cute,Fwd,Sm100] Support q_stage=1 for inference
#1993 opened Nov 7, 2025 by timmy-feng
[Cute,Fwd,Sm90] Support KV cache
#1992 opened Nov 6, 2025 by imbr92
Make wheel name and version consistent
#1956 opened Oct 22, 2025 by bobingm
fix: nan when m_i_new=-inf in online softmax
#1948 opened Oct 20, 2025 by tongyx361
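
In online softmax the accumulator is rescaled by exp(m_i - m_i_new) each step; when every position seen so far is masked, both maxima are -inf and that exponent becomes exp(-inf - (-inf)) = NaN, the failure #1948 (and the related row_max fix in #2083) addresses. A minimal NumPy sketch of a guarded update, assuming the standard streaming-softmax recurrence; variable names are illustrative, not the kernel's:

```python
import numpy as np

def online_softmax_step(m_i, l_i, acc, scores, v):
    """One streaming-softmax update over a block of attention scores.

    m_i: running row max, l_i: running denominator, acc: running output.
    scores: (rows, block) logits, -inf where masked; v: (block, d) values.
    """
    m_i_new = np.maximum(m_i, scores.max(axis=-1))
    # Guard: a fully masked row keeps m_i_new at -inf; subtracting -inf
    # from -inf would yield NaN, so rescale such rows by 1 instead.
    masked = np.isneginf(m_i_new)
    m_safe = np.where(masked, 0.0, m_i_new)
    alpha = np.where(masked, 1.0, np.exp(m_i - m_safe))
    p = np.exp(scores - m_safe[:, None])  # 0 at masked positions
    l_i = alpha * l_i + p.sum(axis=-1)
    acc = alpha[:, None] * acc + p @ v
    return m_i_new, l_i, acc
```
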
Support running the FA Triton kernel on CPU
#1938 opened Oct 15, 2025 by hellozmz Draft
Fix Windows build error C2039
#1932 opened Oct 12, 2025 by Granddyser
feat: add support for float8 KV cache in FA4
#1914 opened Sep 28, 2025 by yicwang
Fix forward and backward kernels
#1907 opened Sep 24, 2025 by rz2778
Feature/varlen rotray
#1899 opened Sep 19, 2025 by mhoangvslev