Commit Graph

550 Commits

Author SHA1 Message Date
Pzqqt
2bc524a19b Revert "f2fs: avoid to check PG_error flag"
[Suggestion from Tashar02](375754065c (commitcomment-114679849))

This reverts commit 375754065cdb21304bec51240d2fcb03246d4c79.
2023-08-09 18:23:15 -05:00
Chao Yu
3dfdcb791c f2fs: reduce stack memory cost by using bitfield in struct f2fs_io_info
This patch tries to use bitfield in struct f2fs_io_info to improve
memory usage.

struct f2fs_io_info {
...
	unsigned int need_lock:8;	/* indicate we need to lock cp_rwsem */
	unsigned int version:8;		/* version of the node */
	unsigned int submitted:1;	/* indicate IO submission */
	unsigned int in_list:1;		/* indicate fio is in io_list */
	unsigned int is_por:1;		/* indicate IO is from recovery or not */
	unsigned int retry:1;		/* need to reallocate block address */
	unsigned int encrypted:1;	/* indicate file is encrypted */
	unsigned int post_read:1;	/* require post read */
...
};

After this patch, size of struct f2fs_io_info reduces from 136 to 120.

[Nathan: fix a compile warning (single-bit-bitfield-constant-conversion)]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
2023-08-09 18:01:52 -05:00
Yangtao Li
9267967912 f2fs: merge f2fs_show_injection_info() into time_to_inject()
There is no need to additionally use f2fs_show_injection_info()
to output information. Concatenate time_to_inject() and
__time_to_inject() via a macro. In the new __time_to_inject()
function, pass in the caller function name and parent function.

In this way, we no longer need the f2fs_show_injection_info() function,
and let's remove it.

Suggested-by: Chao Yu <chao@kernel.org>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-09 18:01:43 -05:00
Chao Yu
b9d5d11d79 f2fs: avoid to check PG_error flag
After below changes:
commit 14db0b3c7b83 ("fscrypt: stop using PG_error to track error status")
commit 98dc08bae678 ("fsverity: stop using PG_error to track error status")

There is no place in f2fs we will set PG_error flag in page, let's remove
other PG_error usage in f2fs, as a step towards freeing the PG_error flag
for other uses.

Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-09 18:01:40 -05:00
Jaegeuk Kim
4388f9b533 f2fs: add block_age-based extent cache
This patch introduces a runtime hot/cold data separation method
for f2fs, in order to improve the accuracy for data temperature
classification, reduce the garbage collection overhead after
long-term data updates.

Enhanced hot/cold data separation can record data block update
frequency as "age" of the extent per inode, and take use of the age
info to indicate better temperature type for data block allocation:
 - It records total data blocks allocated since mount;
 - When file extent has been updated, it calculate the count of data
blocks allocated since last update as the age of the extent;
 - Before the data block allocated, it searches for the age info and
chooses the suitable segment for allocation.

Test and result:
 - Prepare: create about 30000 files
  * 3% for cold files (with cold file extension like .apk, from 3M to 10M)
  * 50% for warm files (with random file extension like .FcDxq, from 1K
to 4M)
  * 47% for hot files (with hot file extension like .db, from 1K to 256K)
 - create(5%)/random update(90%)/delete(5%) the files
  * total write amount is about 70G
  * fsync will be called for .db files, and buffered write will be used
for other files

The storage of test device is large enough(128G) so that it will not
switch to SSR mode during the test.

Benefit: dirty segment count increment reduce about 14%
 - before: Dirty +21110
 - after:  Dirty +18286

Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-09 17:47:10 -05:00
Jaegeuk Kim
82ad3f41df f2fs: refactor extent_cache to support for read and more
This patch prepares extent_cache to be ready for addition.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-09 17:47:09 -05:00
Jaegeuk Kim
c737b8d17b f2fs: specify extent cache for read explicitly
Let's descrbie it's read extent cache.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-09 17:47:08 -05:00
Jaegeuk Kim
48a3e018a8 f2fs: allow to read node block after shutdown
If block address is still alive, we should give a valid node block even after
shutdown. Otherwise, we can see zero data when reading out a file.

Cc: stable@vger.kernel.org
Fixes: 83a3bfdb5a8 ("f2fs: indicate shutdown f2fs to allow unmount successfully")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-09 17:46:56 -05:00
Chao Yu
db27abb058 f2fs: support recording errors into superblock
This patch supports to record detail reason of FSCORRUPTED error into
f2fs_super_block.s_errors[].

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Fiqri Ardyansyah <fiqri0927936@gmail.com>
2022-11-03 11:48:06 +02:00
Zhang Qilong
43f0d91c43 f2fs: code clean and fix a type error
ERROR: code indent should use tabs where possible
ERROR: spaces required around that ':'
ERROR: incorrect tab

Found serveral code type errors when review the code and fix it.
There is no function change.

Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Fiqri Ardyansyah <fiqri0927936@gmail.com>
2022-11-03 11:48:06 +02:00
Shuqi Zhang
487217d4f8 f2fs: fix wrong dirty page count when race between mmap and fallocate.
This is a BUG_ON issue as follows when running xfstest-generic-503:
WARNING: CPU: 21 PID: 1385 at fs/f2fs/inode.c:762 f2fs_evict_inode+0x847/0xaa0
Modules linked in:
CPU: 21 PID: 1385 Comm: umount Not tainted 5.19.0-rc5+ #73
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014

Call Trace:
evict+0x129/0x2d0
dispose_list+0x4f/0xb0
evict_inodes+0x204/0x230
generic_shutdown_super+0x5b/0x1e0
kill_block_super+0x29/0x80
kill_f2fs_super+0xe6/0x140
deactivate_locked_super+0x44/0xc0
deactivate_super+0x79/0x90
cleanup_mnt+0x114/0x1a0
__cleanup_mnt+0x16/0x20
task_work_run+0x98/0x100
exit_to_user_mode_prepare+0x3d0/0x3e0
syscall_exit_to_user_mode+0x12/0x30
do_syscall_64+0x42/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0

Function flow analysis when BUG occurs:
f2fs_fallocate                    mmap
                                  do_page_fault
                                    pte_spinlock  // ---lock_pte
                                    do_wp_page
                                      wp_page_shared
                                        pte_unmap_unlock   // unlock_pte
                                          do_page_mkwrite
                                          f2fs_vm_page_mkwrite
                                            down_read(invalidate_lock)
                                            lock_page
                                            if (PageMappedToDisk(page))
                                              goto out;
                                            // set_page_dirty  --NOT RUN
                                            out: up_read(invalidate_lock);
                                        finish_mkwrite_fault // unlock_pte
f2fs_collapse_range
  down_write(i_mmap_sem)
  truncate_pagecache
    unmap_mapping_pages
      i_mmap_lock_write // down_write(i_mmap_rwsem)
        ......
        zap_pte_range
          pte_offset_map_lock // ---lock_pte
           set_page_dirty
            f2fs_dirty_data_folio
              if (!folio_test_dirty(folio)) {
                                        fault_dirty_shared_page
                                          set_page_dirty
                                            f2fs_dirty_data_folio
                                              if (!folio_test_dirty(folio)) {
                                                filemap_dirty_folio
                                                f2fs_update_dirty_folio // ++
                                              }
                                            unlock_page
                filemap_dirty_folio
                f2fs_update_dirty_folio // page count++
              }
          pte_unmap_unlock  // --unlock_pte
      i_mmap_unlock_write  // up_write(i_mmap_rwsem)
  truncate_inode_pages
  up_write(i_mmap_sem)

When race happens between mmap-do_page_fault-wp_page_shared and
fallocate-truncate_pagecache-zap_pte_range, the zap_pte_range calls
function set_page_dirty without page lock. Besides, though
truncate_pagecache has immap and pte lock, wp_page_shared calls
fault_dirty_shared_page without any. In this case, two threads race
in f2fs_dirty_data_folio function. Page is set to dirty only ONCE,
but the count is added TWICE by calling filemap_dirty_folio.
Thus the count of dirty page cannot accord with the real dirty pages.

Following is the solution to in case of race happens without any lock.
Since folio_test_set_dirty in filemap_dirty_folio is atomic, judge return
value will not be at risk of race.

Signed-off-by: Shuqi Zhang <zhangshuqi3@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Fiqri Ardyansyah <fiqri0927936@gmail.com>
2022-11-03 11:48:05 +02:00
Mel Gorman
2e73c18044 mm, pagevec: remove cold parameter for pagevecs
Every pagevec_init user claims the pages being released are hot even in
cases where it is unlikely the pages are hot.  As no one cares about the
hotness of pages being released to the allocator, just ditch the
parameter.

No performance impact is expected as the overhead is marginal.  The
parameter is removed simply because it is a bit stupid to have a useless
parameter copied everywhere.

Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2022-09-22 19:09:10 +03:00
Andrzej Perczak
a2cea6a252 fs: f2fs: Remove android tracing and align with upstream
For easier upstream merges.

Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2022-09-22 19:08:19 +03:00
Andrzej Perczak
8efe2afe0f Merge remote-tracking branch 'f2fs/linux-4.14.y' into msm-4.14-base
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2022-09-22 19:08:12 +03:00
pwnrazr
e628b85dab Merge remote-tracking branch 'jaegeuk-f2fs/linux-4.14.y' 2022-06-27 14:29:45 +00:00
pwnrazr
1426f17bc3 Revert f2fs stuff because they force pushed too much
Revert "Merge branch 'jaegeuk-f2fs' into dev-pwn"

This reverts commit 8e8e81eea6a74740fab2f21a3056b0ee040ab16e, reversing
changes made to aeb3f7d0f73fb5131ac2aa298f84453213d012d4.

Revert "Merge branch 'jaegeuk-f2fs' into dev-pwn"

This reverts commit 3053c18d881d96afad797ec29b493e8ac5f3f198, reversing
changes made to cbd6d2445197d6aa7cff37b74b04c99d521940f4.
2022-06-27 14:29:25 +00:00
kondors1995
5dc690a12f f2fs:remove cold parameter for pagevecs 2022-05-19 09:31:00 +00:00
kondors1995
c19de259b9 Merge pwnrazr/jaegeuk-f2fs 2022-05-19 08:50:04 +00:00
kondors1995
1dfbe7754f Revert "Merge branch 'linux-4.14.y' of https://github.com/jaegeuk/f2fs-stable into dev/12.0"
This reverts commit 148d1debc3, reversing
changes made to 1997320ffd.
2022-05-02 07:08:57 +00:00
Mel Gorman
7698434169 mm, pagevec: remove cold parameter for pagevecs
Every pagevec_init user claims the pages being released are hot even in
cases where it is unlikely the pages are hot.  As no one cares about the
hotness of pages being released to the allocator, just ditch the
parameter.

No performance impact is expected as the overhead is marginal.  The
parameter is removed simply because it is a bit stupid to have a useless
parameter copied everywhere.

Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2022-04-03 15:15:45 +00:00
kondors1995
148d1debc3 Merge branch 'linux-4.14.y' of https://github.com/jaegeuk/f2fs-stable into dev/12.0 2022-04-03 15:15:10 +00:00
kondors1995
1997320ffd checkotu f2fs to 6aceb2279b 2022-04-03 15:13:50 +00:00
Jaegeuk Kim
e6ce6c6536 f2fs: avoid infinite loop to flush node pages
xfstests/generic/475 can give EIO all the time which give an infinite loop
to flush node page like below. Let's avoid it.

[16418.518551] Call Trace:
[16418.518553]  ? dm_submit_bio+0x48/0x400
[16418.518574]  ? submit_bio_checks+0x1ac/0x5a0
[16418.525207]  __submit_bio+0x1a9/0x230
[16418.525210]  ? kmem_cache_alloc+0x29e/0x3c0
[16418.525223]  submit_bio_noacct+0xa8/0x2b0
[16418.525226]  submit_bio+0x4d/0x130
[16418.525238]  __submit_bio+0x49/0x310 [f2fs]
[16418.525339]  ? bio_add_page+0x6a/0x90
[16418.525344]  f2fs_submit_page_bio+0x134/0x1f0 [f2fs]
[16418.525365]  read_node_page+0x125/0x1b0 [f2fs]
[16418.525388]  __get_node_page.part.0+0x58/0x3f0 [f2fs]
[16418.525409]  __get_node_page+0x2f/0x60 [f2fs]
[16418.525431]  f2fs_get_dnode_of_data+0x423/0x860 [f2fs]
[16418.525452]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[16418.525458]  ? __mod_memcg_state.part.0+0x2a/0x30
[16418.525465]  ? __mod_memcg_lruvec_state+0x27/0x40
[16418.525467]  ? __xa_set_mark+0x57/0x70
[16418.525472]  f2fs_do_write_data_page+0x10e/0x7b0 [f2fs]
[16418.525493]  f2fs_write_single_data_page+0x555/0x830 [f2fs]
[16418.525514]  ? sysvec_apic_timer_interrupt+0x4e/0x90
[16418.525518]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[16418.525523]  f2fs_write_cache_pages+0x303/0x880 [f2fs]
[16418.525545]  ? blk_flush_plug_list+0x47/0x100
[16418.525548]  f2fs_write_data_pages+0xfd/0x320 [f2fs]
[16418.525569]  do_writepages+0xd5/0x210
[16418.525648]  filemap_fdatawrite_wbc+0x7d/0xc0
[16418.525655]  filemap_fdatawrite+0x50/0x70
[16418.525658]  f2fs_sync_dirty_inodes+0xa4/0x230 [f2fs]
[16418.525679]  f2fs_write_checkpoint+0x16d/0x1720 [f2fs]
[16418.525699]  ? ttwu_do_wakeup+0x1c/0x160
[16418.525709]  ? ttwu_do_activate+0x6d/0xd0
[16418.525711]  ? __wait_for_common+0x11d/0x150
[16418.525715]  kill_f2fs_super+0xca/0x100 [f2fs]
[16418.525733]  deactivate_locked_super+0x3b/0xb0
[16418.525739]  deactivate_super+0x40/0x50
[16418.525741]  cleanup_mnt+0x139/0x190
[16418.525747]  __cleanup_mnt+0x12/0x20
[16418.525749]  task_work_run+0x6d/0xa0
[16418.525765]  exit_to_user_mode_prepare+0x1ad/0x1b0
[16418.525771]  syscall_exit_to_user_mode+0x27/0x50
[16418.525774]  do_syscall_64+0x48/0xc0
[16418.525776]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-03-29 16:29:25 -07:00
Chao Yu
76eea17597 f2fs: fix to avoid potential deadlock
Quoted from Jing Xia's report, there is a potential deadlock may happen
between kworker and checkpoint as below:

[T:writeback]				[T:checkpoint]
- wb_writeback
 - blk_start_plug
bio contains NodeA was plugged in writeback threads
					- do_writepages  -- sync write inodeB, inc wb_sync_req[DATA]
					 - f2fs_write_data_pages
					  - f2fs_write_single_data_page -- write last dirty page
					   - f2fs_do_write_data_page
					    - set_page_writeback  -- clear page dirty flag and
					    PAGECACHE_TAG_DIRTY tag in radix tree
					    - f2fs_outplace_write_data
					     - f2fs_update_data_blkaddr
					      - f2fs_wait_on_page_writeback -- wait NodeA to writeback here
					   - inode_dec_dirty_pages
 - writeback_sb_inodes
  - writeback_single_inode
   - do_writepages
    - f2fs_write_data_pages -- skip writepages due to wb_sync_req[DATA]
     - wbc->pages_skipped += get_dirty_pages() -- PAGECACHE_TAG_DIRTY is not set but get_dirty_pages() returns one
  - requeue_inode -- requeue inode to wb->b_dirty queue due to non-zero.pages_skipped
 - blk_finish_plug

Let's try to avoid deadlock condition by forcing unplugging previous bio via
blk_finish_plug(current->plug) once we'v skipped writeback in writepages()
due to valid sbi->wb_sync_req[DATA/NODE].

Fixes: 687de7f101 ("f2fs: avoid IO split due to mixed WB_SYNC_ALL and WB_SYNC_NONE")
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Jing Xia <jing.xia@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-03-03 16:23:55 -08:00
kondors1995
e64c7ea3b1 Merge branch 'linux-4.14.y' of github.com:jaegeuk/f2fs-stable into dev/f2fs-usptream 2022-02-17 09:26:56 +00:00
Jaegeuk Kim
a873cc61ca f2fs: add a way to limit roll forward recovery time
This adds a sysfs entry to call checkpoint during fsync() in order to avoid
long elapsed time to run roll-forward recovery when booting the device.
Default value doesn't enforce the limitation which is same as before.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-02-11 14:12:05 -08:00
Tim Murray
07efb75333 f2fs: move f2fs to use reader-unfair rwsems
f2fs rw_semaphores work better if writers can starve readers,
especially for the checkpoint thread, because writers are strictly
more important than reader threads. This prevents significant priority
inversion between low-priority readers that blocked while trying to
acquire the read lock and a second acquisition of the write lock that
might be blocking high priority work.

Signed-off-by: Tim Murray <timmurray@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-01-19 11:27:29 -08:00
kondors1995
5a4067f18b Merge branch 'upstream-f2fs-stable-linux-4.14.y' of https://android.googlesource.com/kernel/common into 12.0 2021-12-15 12:23:59 +00:00
Jaegeuk Kim
ffae8cb088 f2fs: do not bother checkpoint by f2fs_get_node_info
This patch tries to mitigate lock contention between f2fs_write_checkpoint and
f2fs_get_node_info along with nat_tree_lock.

The idea is, if checkpoint is currently running, other threads that try to grab
nat_tree_lock would be better to wait for checkpoint.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-12-14 11:05:54 -08:00
Jaegeuk Kim
af054f705f f2fs: avoid down_write on nat_tree_lock during checkpoint
Let's cache nat entry if there's no lock contention only.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-12-14 11:05:54 -08:00
Weichao Guo
25c02f933f f2fs: set SBI_NEED_FSCK flag when inconsistent node block found
Inconsistent node block will cause a file fail to open or read,
which could make the user process crashes or stucks. Let's mark
SBI_NEED_FSCK flag to trigger a fix at next fsck time. After
unlinking the corrupted file, the user process could regenerate
a new one and work correctly.

Signed-off-by: Weichao Guo <guoweichao@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-11-11 08:50:30 -08:00
kondors1995
bfd72539b6 Merge branch 'linux-4.14.y' of github.com:jaegeuk/f2fs-stable into 11.0 2021-09-04 13:42:57 +00:00
Chao Yu
88c401eb66 f2fs: rebuild nat_bits during umount
If all free_nat_bitmap are available, we can rebuild nat_bits from
free_nat_bitmap entirely during umount, let's make another chance
to reenable nat_bits for image.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-23 17:17:21 -07:00
Daeho Jeong
cce9963715 f2fs: separate out iostat feature
Added F2FS_IOSTAT config option to support getting IO statistics through
sysfs and printing out periodic IO statistics tracepoint events and
moved I/O statistics related codes into separate files for better
maintenance.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
[Jaegeuk Kim: set default=y]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-23 17:17:19 -07:00
Chao Yu
e92c74329f f2fs: support fault injection for f2fs_kmem_cache_alloc()
This patch supports to inject fault into f2fs_kmem_cache_alloc().

Usage:
a) echo 32768 > /sys/fs/f2fs/<dev>/inject_type or
b) mount -o fault_type=32768 <dev> <mountpoint>

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-23 14:40:24 -07:00
nato6613
5bd2299534 Merge remote-tracking branch 'jaegeuk/linux-4.14.y' into 11.0 2021-08-09 06:52:33 +00:00
Chao Yu
79a989e94a f2fs: extent cache: support unaligned extent
Compressed inode may suffer read performance issue due to it can not
use extent cache, so I propose to add this unaligned extent support
to improve it.

Currently, it only works in readonly format f2fs image.

Unaligned extent: in one compressed cluster, physical block number
will be less than logical block number, so we add an extra physical
block length in extent info in order to indicate such extent status.

The idea is if one whole cluster blocks are contiguous physically,
once its mapping info was readed at first time, we will cache an
unaligned (or aligned) extent info entry in extent cache, it expects
that the mapping info will be hitted when rereading cluster.

Merge policy:
- Aligned extents can be merged.
- Aligned extent and unaligned extent can not be merged.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-05 11:39:36 -07:00
Jaegeuk Kim
b76b3c3bb7 f2fs: do not submit NEW_ADDR to read node block
After the below patch, give cp is errored, we drop dirty node pages. This
can give NEW_ADDR to read node pages. Don't do WARN_ON() which gives
generic/475 failure.

Fixes: 28607bf3aa6f ("f2fs: drop dirty node pages when cp is in error status")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-04 16:28:25 -07:00
UtsavBalar1231
6be9bad061 Merge remote-tracking branch 'jaeguek/f2fs-stable/linux-4.14.y' into android11-base
* jaeguek/f2fs-stable/linux-4.14.y:
  f2fs: fix preallocation
  f2fs: fix the f2fs_file_write_iter tracepoint
  f2fs: reduce indentation in f2fs_file_write_iter()
  f2fs: rework write preallocations
  f2fs: change fiemap way in printing compression chunk
  f2fs: do not submit NEW_ADDR to read node block
  f2fs: compress: remove unneeded read when rewrite whole cluster
  f2fs: don't sleep while grabing nat_tree_lock
  f2fs: remove allow_outplace_dio()
  f2fs: make f2fs_write_failed() take struct inode
  f2fs: quota: fix potential deadlock
  f2fs: let's keep writing IOs on SBI_NEED_FSCK
  f2fs: Revert "f2fs: Fix indefinite loop in f2fs_gc() v1"
  f2fs: avoid to create an empty string as the extension_list
  f2fs: compress: fix to set zstd compress level correctly
  f2fs: add sysfs nodes to get GC info for each GC mode

Change-Id: I3f5747df6a39bf84819ee7e8fc9f8a477aab04c5
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
2021-07-31 19:51:19 +05:30
Jaegeuk Kim
943859b878 f2fs: do not submit NEW_ADDR to read node block
After the below patch, give cp is errored, we drop dirty node pages. This
can give NEW_ADDR to read node pages. Don't do WARN_ON() which gives
generic/475 failure.

Fixes: 28607bf3aa6f ("f2fs: drop dirty node pages when cp is in error status")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-26 09:20:31 -07:00
Jaegeuk Kim
4e03038ae1 f2fs: don't sleep while grabing nat_tree_lock
This tries to fix priority inversion in the below condition resulting in
long checkpoint delay.

f2fs_get_node_info()
 - nat_tree_lock
  -> sleep to grab journal_rwsem by contention

                                     checkpoint
                                     - waiting for nat_tree_lock

In order to let checkpoint go, let's release nat_tree_lock, if there's a
journal_rwsem contention.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-23 15:59:15 -07:00
UtsavBalar1231
7ac526b5dd Merge remote-tracking branch 'jaeguek/f2fs-stable/linux-4.14.y' into android11-base
* jaeguek/f2fs-stable/linux-4.14.y:
  f2fs: drop dirty node pages when cp is in error status
  f2fs: initialize page->private when using for our internal use
  f2fs: compress: add nocompress extensions support

Change-Id: I353edee1e8b76408bc11eb699d67d3a46ef6a2c1
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
2021-07-09 21:20:19 +05:30
Jaegeuk Kim
eab51b70c4 f2fs: drop dirty node pages when cp is in error status
Otherwise, writeback is going to fall in a loop to flush dirty inode forever
before getting SBI_CLOSING.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-07-06 22:07:28 -07:00
UtsavBalar1231
d5463506f0 Merge remote-tracking branch 'jaeguek/f2fs-stable/linux-4.14.y' into android11-base
* jaeguek/f2fs-stable/linux-4.14.y:
  Revert "f2fs: avoid attaching SB_ACTIVE flag during mount/remount"
  f2fs: remove false alarm on iget failure during GC
  f2fs: enable extent cache for compression files in read-only
  f2fs: fix to avoid adding tab before doc section
  f2fs: introduce f2fs_casefolded_name slab cache
  f2fs: swap: support migrating swapfile in aligned write mode
  f2fs: swap: remove dead codes
  f2fs: compress: add compress_inode to cache compressed blocks
  f2fs: clean up /sys/fs/f2fs/<disk>/features
  f2fs: add pin_file in feature list
  f2fs: Advertise encrypted casefolding in sysfs
  f2fs: Show casefolding support only when supported
  f2fs: support RO feature
  f2fs: logging neatening
  f2fs: restructure f2fs page.private layout
  f2fs: introduce FI_COMPRESS_RELEASED instead of using IMMUTABLE bit
  f2fs: compress: remove unneeded preallocation
  f2fs: avoid attaching SB_ACTIVE flag during mount/remount
  f2fs: atgc: export entries for better tunability via sysfs
  f2fs: compress: fix to disallow temp extension
  f2fs: let's allow compression for mmap files
  f2fs: add MODULE_SOFTDEP to ensure crc32 is included in the initramfs
  f2fs: return success if there is no work to do
  f2fs: compress: clean up parameter of __f2fs_cluster_blocks()
  f2fs: compress: remove unneeded f2fs_put_dnode()
  f2fs: atgc: fix to set default age threshold
  f2fs: Prevent swap file in LFS mode
  f2fs: fix to avoid racing on fsync_entry_slab by multi filesystem instances
  f2fs: add cp_error check in f2fs_write_compressed_pages
  f2fs: compress: rename __cluster_may_compress

Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
Change-Id: I5c4b919a65fe979ddb41bc484535d7be806a7de5

Conflicts:
	fs/f2fs/data.c
2021-06-26 11:30:58 +05:30
Chao Yu
2dfe592f02 f2fs: compress: add compress_inode to cache compressed blocks
Support to use address space of inner inode to cache compressed block,
in order to improve cache hit ratio of random read.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-21 16:12:01 -07:00
Chao Yu
e0ae2ee2d9 f2fs: restructure f2fs page.private layout
Restruct f2fs page private layout for below reasons:

There are some cases that f2fs wants to set a flag in a page to
indicate a specified status of page:
a) page is in transaction list for atomic write
b) page contains dummy data for aligned write
c) page is migrating for GC
d) page contains inline data for inline inode flush
e) page belongs to merkle tree, and is verified for fsverity
f) page is dirty and has filesystem/inode reference count for writeback
g) page is temporary and has decompress io context reference for compression

There are existed places in page structure we can use to store
f2fs private status/data:
- page.flags: PG_checked, PG_private
- page.private

However it was a mess when we using them, which may cause potential
confliction:
		page.private	PG_private	PG_checked	page._refcount (+1 at most)
a)		-1		set				+1
b)		-2		set
c), d), e)					set
f)		0		set				+1
g)		pointer		set

The other problem is page.flags has no free slot, if we can avoid set
zero to page.private and set PG_private flag, then we use non-zero value
to indicate PG_private status, so that we may have chance to reclaim
PG_private slot for other usage. [1]

The other concern is f2fs has bad scalability in aspect of indicating
more page status.

So in this patch, let's restructure f2fs' page.private as below to
solve above issues:

Layout A: lowest bit should be 1
| bit0 = 1 | bit1 | bit2 | ... | bit MAX | private data .... |
 bit 0	PAGE_PRIVATE_NOT_POINTER
 bit 1	PAGE_PRIVATE_ATOMIC_WRITE
 bit 2	PAGE_PRIVATE_DUMMY_WRITE
 bit 3	PAGE_PRIVATE_ONGOING_MIGRATION
 bit 4	PAGE_PRIVATE_INLINE_INODE
 bit 5	PAGE_PRIVATE_REF_RESOURCE
 bit 6-	f2fs private data

Layout B: lowest bit should be 0
 page.private is a wrapped pointer.

After the change:
		page.private	PG_private	PG_checked	page._refcount (+1 at most)
a)		11		set				+1
b)		101		set				+1
c)		1001		set				+1
d)		10001		set				+1
e)						set
f)		100001		set				+1
g)		pointer		set				+1

[1] https://lore.kernel.org/linux-f2fs-devel/20210422154705.GO3596236@casper.infradead.org/T/#u

Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-06-21 08:07:02 -07:00
UtsavBalar1231
9ffaea31f0 Merge remote-tracking branch 'jaegeuk/f2fs-stable/linux-4.14.y' into android11-base
* jaegeuk/f2fs-stable/linux-4.14.y:
  f2fs: drop inplace IO if fs status is abnormal
  f2fs: compress: remove unneed check condition
  f2fs: clean up left deprecated IO trace codes
  f2fs: avoid using native allocate_segment_by_default()
  f2fs: remove unnecessary struct declaration
  f2fs: fix to avoid NULL pointer dereference
  f2fs: avoid duplicated codes for cleanup
  f2fs: document: add description about compressed space handling
  f2fs: clean up build warnings
  f2fs: fix the periodic wakeups of discard thread
  f2fs: fix to avoid accessing invalid fio in f2fs_allocate_data_block()
  f2fs: fix to avoid GC/mmap race with f2fs_truncate()
  f2fs: set checkpoint_merge by default
  f2fs: Fix a hungtask problem in atomic write
  f2fs: fix to restrict mount condition on readonly block device
  f2fs: introduce gc_merge mount option
  f2fs: fix to cover __allocate_new_section() with curseg_lock
  f2fs: fix wrong alloc_type in f2fs_do_replace_block
  f2fs: delete empty compress.h
  f2fs: fix a typo in inode.c
  f2fs: allow to change discard policy based on cached discard cmds
  f2fs: fix to avoid touching checkpointed data in get_victim()
  f2fs: fix to update last i_size if fallocate partially succeeds
  f2fs: fix error path of f2fs_remount()
  f2fs: fix wrong comment of nat_tree_lock
  f2fs: fix to avoid out-of-bounds memory access
  f2fs: don't start checkpoint thread in readonly mountpoint
  f2fs: do not use AT_SSR mode in FG_GC & high urgent BG_GC
  f2fs: add sysfs nodes to get runtime compression stat
  f2fs: fix to use per-inode maxbytes in f2fs_fiemap
  f2fs: fix to align to section for fallocate() on pinned file
  f2fs: expose # of overprivision segments
  f2fs: fix error handling in f2fs_end_enable_verity()
  f2fs: fix a redundant call to f2fs_balance_fs if an error occurs
  f2fs: remove unused file_clear_encrypt()
  f2fs: check if swapfile is section-alligned
  f2fs: fix last_lblock check in check_swap_activate_fast
  f2fs: remove unnecessary IS_SWAPFILE check
  f2fs: Replace one-element array with flexible-array member
  f2fs: compress: Allow modular (de)compression algorithms
  f2fs: check discard command number before traversing discard pending list
  f2fs: update comments for explicit memory barrier
  f2fs: remove unused FORCE_FG_GC macro
  f2fs: avoid unused f2fs_show_compress_options()
  f2fs: fix panic during f2fs_resize_fs()
  f2fs: fix to allow migrating fully valid segment
  f2fs: fix a spelling error
  f2fs: fix a spacing coding style
  fs: Enable bmap() function to properly return errors
  f2fs: remove obsolete f2fs.txt

Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
2021-05-01 17:29:30 +05:30
UtsavBalar1231
8c5242a6e2 Revert "Merge remote-tracking branch 'jaegeuk/f2fs-stable/linux-4.14.y' into android11-base"
This reverts commit b8e455bda9, reversing
changes made to e74d719fa1.

Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
2021-05-01 17:29:04 +05:30
Yi Zhuang
ede9671e92 f2fs: clean up build warnings
This patch combined the below three clean-up patches.

- modify open brace '{' following function definitions
- ERROR: spaces required around that ':'
- ERROR: spaces required before the open parenthesis '('
- ERROR: spaces prohibited before that ','
- Made suggested modifications from checkpatch in reference to WARNING:
 Missing a blank line after declarations

Signed-off-by: Yi Zhuang <zhuangyi1@huawei.com>
Signed-off-by: Jia Yang <jiayang5@huawei.com>
Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-12 11:23:18 -07:00
Sahitya Tummala
3ffc0565ab f2fs: allow to change discard policy based on cached discard cmds
With the default DPOLICY_BG discard thread is ioaware, which prevents
the discard thread from issuing the discard commands. On low RAM setups,
it is observed that these discard commands in the cache are consuming
high memory. This patch aims to relax the memory pressure on the system
due to f2fs pending discard cmds by changing the policy to DPOLICY_FORCE
based on the nm_i->ram_thresh configured.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-04-12 11:23:16 -07:00