Merge lp://qastaging/~gl-az/percona-xtrabackup/BT23557-2.1-lp1160788 into lp://qastaging/percona-xtrabackup/2.1

Proposed by George Ormond Lorch III
Status: Merged
Approved by: Alexey Kopytov
Approved revision: no longer in the source branch.
Merged at revision: 650
Proposed branch: lp://qastaging/~gl-az/percona-xtrabackup/BT23557-2.1-lp1160788
Merge into: lp://qastaging/percona-xtrabackup/2.1
Diff against target: 712 lines (+327/-78) (has conflicts)
12 files modified
innobackupex (+294/-38)
test/inc/ib_stream_common.sh (+9/-3)
test/inc/xb_local.sh (+7/-3)
test/t/ib_compress_basic.sh (+1/-2)
test/t/ib_stream_compress.sh (+1/-2)
test/t/ib_stream_compress_encrypt.sh (+1/-2)
test/t/xb_compress.sh (+1/-2)
test/t/xb_compress_encrypt.sh (+2/-9)
test/t/xb_encrypt.sh (+8/-0)
test/t/xb_parallel_compress.sh (+1/-2)
test/t/xb_parallel_compress_encrypt.sh (+1/-9)
test/t/xb_parallel_encrypt.sh (+1/-6)
Text conflict in innobackupex
Text conflict in test/t/xb_encrypt.sh
To merge this branch: bzr merge lp://qastaging/~gl-az/percona-xtrabackup/BT23557-2.1-lp1160788
Reviewer                     Review Type   Date Requested   Status
Alexey Kopytov (community)                                  Approve
Vlad Lesin                   g2                             Pending
Review via email: mp+177057@code.qastaging.launchpad.net

This proposal supersedes a proposal from 2013-07-12.

Description of the change

Added --decompress and --decrypt options, both with functioning --parallel, to innobackupex, based on lp1160788. These options decrypt and/or decompress a backup made with the --compress and/or --encrypt options.

When decrypting, the encryption algorithm and key used when the backup was taken MUST be provided via the --decrypt=ALGORITHM option and either the --encrypt-key=LITERAL-KEY or the --encrypt-key-file=KEY-FILE option.

For decompression to work, the qpress binary must be present in the PATH.

--decrypt and --decompress may be used together in a single call to completely normalize a previously compressed and encrypted backup. In some rare instances, however, there may be I/O buffer overflow issues, which would require calling innobackupex twice (once for decryption and once for decompression) instead of a single combined call.

The --parallel option will allow multiple files to be decrypted and/or decompressed simultaneously.

Use of these options will remove the original compressed/encrypted files and leave the results in the same location.
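As a rough illustration of the behaviour described above, the per-file decision can be sketched in Python. This is not the innobackupex code (which is Perl): `plan_steps` is a hypothetical helper, the `.qp` and `.xbcrypt` suffixes are taken from the review discussion, and the assumption that encryption is the outermost layer (so it is undone first) is mine.

```python
from typing import List, Tuple

# Suffixes mentioned in the review discussion; treated here as assumptions.
ENCRYPTED_SUFFIX = ".xbcrypt"
COMPRESSED_SUFFIX = ".qp"


def plan_steps(filename: str, decrypt: bool, decompress: bool) -> Tuple[List[str], str]:
    """Return (steps, final_name) for one backup file (hypothetical helper)."""
    steps: List[str] = []
    name = filename
    # Assumption: encryption is the outermost layer, so it is undone first.
    if decrypt and name.endswith(ENCRYPTED_SUFFIX):
        steps.append("decrypt")
        name = name[: -len(ENCRYPTED_SUFFIX)]
    # Decompression only applies once the remaining name ends in .qp.
    if decompress and name.endswith(COMPRESSED_SUFFIX):
        steps.append("decompress")
        name = name[: -len(COMPRESSED_SUFFIX)]
    return steps, name


if __name__ == "__main__":
    print(plan_steps("ibdata1.qp.xbcrypt", decrypt=True, decompress=True))
```

Under these assumptions, a file that is still encrypted is left untouched when only decompression is requested, because its name does not yet end in `.qp`.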

Test suite cases have been modified to use these new options where appropriate:
  test/t/ib_compress_basic.sh --decompress
  test/t/ib_stream_compress.sh --decompress
  test/t/ib_stream_compress_encrypt.sh --decompress
  test/t/xb_compress.sh --decompress
  test/t/xb_compress_encrypt.sh --decrypt --decompress in two individual invocations of innobackupex
  test/t/xb_encrypt.sh --decrypt
  test/t/xb_parallel_compress.sh --decompress --parallel
  test/t/xb_parallel_compress_encrypt.sh --decrypt --decompress --parallel in a single invocation of innobackupex
  test/t/xb_parallel_encrypt.sh --decrypt --parallel

-----
Rebased on trunk and fixed previous MP issue.

jenkins http://jenkins.percona.com/view/XtraBackup/job/percona-xtrabackup-2.1-param/376/
-----
Rebased on trunk again and fixed more review issues.

jenkins http://jenkins.percona.com/view/XtraBackup/job/percona-xtrabackup-2.1-param/403/

George Ormond Lorch III (gl-az) wrote : Posted in a previous version of this proposal

Some metrics from an HP Cloud instance:

$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 1
CPU socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 2
Stepping: 3
CPU MHz: 2666.760
BogoMIPS: 5333.52
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 4096K
NUMA node0 CPU(s): 0,1

Database was prepped with sysbench parallel-prepare, 16 tables and 5000000 rows, for about 20GB.

Backup was performed with --compress --encrypt=AES256

Serial, no parallel decrypt/decompress:
  $ time innobackupex --decrypt=AES256 --encrypt-key=12345678123456781234567812345678 --decompress ./backup

  real 10m50.492s
  user 9m51.688s
  sys 0m53.628s

Parallel with 2 forks decrypt/decompress:
  $ time innobackupex --decrypt=AES256 --encrypt-key=12345678123456781234567812345678 --decompress --parallel=2 ./backup

  real 5m38.778s
  user 9m50.381s
  sys 0m56.072s

Parallel with 4 forks decrypt/decompress:
  $ time innobackupex --decrypt=AES256 --encrypt-key=12345678123456781234567812345678 --decompress --parallel=4 ./backup

  real 5m38.861s
  user 9m50.900s
  sys 0m56.473s

So we see a near-linear improvement here up to the number of available CPUs (2 on this instance), as long as I/O can keep up.
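The wall-clock speedup implied by the three `real` timings above can be worked out with a few lines of Python. The timings are copied from the runs above; the helper name `to_seconds` and the convention of labelling the serial run as 1 fork are illustrative choices, not part of the proposal.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time` value such as '10m50.492s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# 'real' times from the runs above, keyed by fork count (serial run shown as 1).
real_times = {1: "10m50.492s", 2: "5m38.778s", 4: "5m38.861s"}

serial = to_seconds(real_times[1])
speedups = {forks: serial / to_seconds(t) for forks, t in real_times.items()}

if __name__ == "__main__":
    for forks, speedup in speedups.items():
        print(f"{forks} fork(s): {speedup:.2f}x")
```

With 2 forks the run is about 1.92 times faster than serial, and 4 forks give essentially the same figure, which is consistent with the instance having only 2 CPUs.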

George Ormond Lorch III (gl-az) wrote : Posted in a previous version of this proposal

Some metrics from some monster machine provided by Vadim:

$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Stepping: 7
CPU MHz: 2100.000
BogoMIPS: 4199.43
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-31

Database was prepped with sysbench parallel-prepare, 32 tables and 2500000 rows, for about 20GB.

Backup was performed with --compress --encrypt=AES256

Serial, no parallel decrypt/decompress:
  $ time innobackupex --decrypt=AES256 --encrypt-key=12345678123456781234567812345678 --decompress ./backup

  real 7m57.206s
  user 8m14.755s
  sys 0m33.522s

Parallel with 16 forks decrypt/decompress:
  $ time innobackupex --decrypt=AES256 --encrypt-key=12345678123456781234567812345678 --decompress --parallel=16 ./backup

  real 2m30.460s
  user 9m50.874s
  sys 11m32.019s

Parallel with 24 forks decrypt/decompress:
  $ time innobackupex --decrypt=AES256 --encrypt-key=12345678123456781234567812345678 --decompress --parallel=24 ./backup

  real 2m1.898s
  user 9m58.801s
  sys 14m12.701s

Parallel with 32 forks decrypt/decompress:
  $ time innobackupex --decrypt=AES256 --encrypt-key=12345678123456781234567812345678 --decompress --parallel=32 ./backup

  real 2m4.559s
  user 10m20.132s
  sys 21m29.176s

Again, we see a nice improvement up to the point we hit i/o limits...
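To see where the improvement flattens out, it can help to compare total CPU time (user + sys) with wall-clock time for each run on this machine. The sketch below only rearranges the numbers reported above; the dictionary layout and the helper names `to_seconds` and `cpu_seconds` are illustrative.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time` value such as '7m57.206s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# (real, user, sys) as reported above for each run on the 32-CPU machine.
runs = {
    "serial": ("7m57.206s", "8m14.755s", "0m33.522s"),
    "16 forks": ("2m30.460s", "9m50.874s", "11m32.019s"),
    "24 forks": ("2m1.898s", "9m58.801s", "14m12.701s"),
    "32 forks": ("2m4.559s", "10m20.132s", "21m29.176s"),
}


def cpu_seconds(run: str) -> float:
    """Total CPU time (user + sys) consumed by one run, in seconds."""
    _, user, system = runs[run]
    return to_seconds(user) + to_seconds(system)


if __name__ == "__main__":
    serial_real = to_seconds(runs["serial"][0])
    for name, (real, _, _) in runs.items():
        speedup = serial_real / to_seconds(real)
        print(f"{name}: speedup {speedup:.2f}x, CPU time {cpu_seconds(name):.0f}s")
```

The speedup peaks at roughly 3.9x with 24 forks, while total CPU time grows from about 528s in the serial run to about 1909s with 32 forks, almost all of the increase being sys time.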

Stewart Smith (stewart) wrote : Posted in a previous version of this proposal

George Ormond Lorch III <email address hidden> writes:
> Some metrics from some monster machine provided by Vadim:

Blog post coming in 5..4..3..2..1 ? :)

--
Stewart Smith

Vlad Lesin (vlad-lesin) wrote : Posted in a previous version of this proposal

Looks good. Two minor remarks:

1) I think there should be a test that uses the --decrypt and --decompress options in one innobackupex call.

2) Lines 498-500: Why is there "DECRYPT AND DECOMPRESS" log output if only decompression is used?

review: Needs Fixing (g2)
Raghavendra D Prabhu (raghavendra-prabhu) wrote : Posted in a previous version of this proposal

Parallel decompression/decryption are nice. However, are there plans to
add this for streaming? From the tests I see, the decompress/decrypt
are being run on non-streamed backups and since it looks for files with
extensions (qp/xbcrypt), it is not possible to use this on streaming
backups (or earlier streamed backups in case of --encrypt since
decryption needs to be done before xbstream extraction there)

George Ormond Lorch III (gl-az) wrote : Posted in a previous version of this proposal

No because qpress is not stored in a streaming compatible format.
xbcrypt can be handled via stream but not qp.
On 7/12/2013 2:09 AM, Raghavendra D Prabhu wrote:
> Parallel decompression/decryption are nice. However, are there plans to
> add this for streaming? From the tests I see, the decompress/decrypt
> are being run on non-streamed backups and since it looks for files with
> extensions (qp/xbcrypt), it is not possible to use this on streaming
> backups (or earlier streamed backups in case of --encrypt since
> decryption needs to be done before xbstream extraction there)
>

--
George O. Lorch III
Software Engineer, Percona
+1-888-401-3401 x542 US/Arizona (GMT -7)
skype: george.ormond.lorch.iii

George Ormond Lorch III (gl-az) wrote : Posted in a previous version of this proposal

I shouldn't respond until I have had my full coffee...

There are really two problems in doing this right now. One is that
innobackupex --compress only passes the --compress option(s) down into
xtrabackup where only InnoDB files get compressed, all other files
currently go into the archive uncompressed so doing something like
"... xbcrypt -d ... | xbstream -x | qpress -di ..." can't work in a
pure stream since not all files in the stream are .qp. The other is that
qpress does not store the filename/directory structure within the .qp
and as such can't recreate the same output structure as say xbcrypt and
xbstream can; it is basically just a block compressor/decompressor.

Maybe one day when all of the decryption | decompression | destreaming
is included in a single executable that is aware of the full structure
and format it would be possible. Alexey might have some ideas on an easy
way to pull this off within innobackupex using some of his mad linux skills.

On 7/12/2013 7:37 AM, George Ormond Lorch III wrote:
> No because qpress is not stored in a streaming compatible format.
> xbcrypt can be handled via stream but not qp.
> On 7/12/2013 2:09 AM, Raghavendra D Prabhu wrote:
>> Parallel decompression/decryption are nice. However, are there plans to
>> add this for streaming? From the tests I see, the decompress/decrypt
>> are being run on non-streamed backups and since it looks for files with
>> extensions (qp/xbcrypt), it is not possible to use this on streaming
>> backups (or earlier streamed backups in case of --encrypt since
>> decryption needs to be done before xbstream extraction there)
>>
>

--
George O. Lorch III
Software Engineer, Percona
+1-888-401-3401 x542 US/Arizona (GMT -7)
skype: george.ormond.lorch.iii

Vlad Lesin (vlad-lesin) wrote : Posted in a previous version of this proposal

1) Misprint in 477: "oarallel" instead of "parallel";

2) "DECRYPTING AND/OR DECOMPRESSING" in 524;

3) I have not found a check that $option_parallel is a positive value.

'parallel=i' => \$option_parallel,

$option_parallel is supposed to be an integer. If $option_parallel is less than 0, decrypt_decompress() will not work correctly. It seems there will be an infinite loop in:

while ($freepidindex >= $option_parallel)

It should be checked somewhere.
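The kind of check being asked for can be sketched as follows. This is a Python stand-in, not the Perl in innobackupex: `validate_parallel` and `process_all` are hypothetical names, a thread pool is used here in place of the forked children the script uses, and rejecting 0 as well as negative values is an assumption on my part.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def validate_parallel(value: int) -> int:
    """Reject non-integer or non-positive worker counts before any work starts."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("parallel must be an integer")
    if value < 1:
        # A count below 1 would leave no free worker slot to wait for.
        raise ValueError("parallel must be at least 1")
    return value


def process_all(files: Iterable[str], worker: Callable[[str], str], parallel: int) -> List[str]:
    """Run `worker` over `files` with at most `parallel` concurrent workers."""
    workers = validate_parallel(parallel)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order and propagates worker exceptions.
        return list(pool.map(worker, files))


if __name__ == "__main__":
    print(process_all(["a.qp", "b.qp"], str.upper, parallel=2))
```

Validating once at option-parsing time keeps the scheduling loop free of special cases, since it can then rely on the worker count being at least 1.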

review: Needs Fixing (g2)
Alexey Kopytov (akopytov) :
review: Approve
