  1. Nov 29, 2017
  2. Nov 28, 2017
    • blockjob: Remove the job from the list earlier in block_job_unref() · 0a3e155f
      Alberto Garcia authored
      
      When destroying a block job in block_job_unref() we should remove it
      from the job list before calling block_job_remove_all_bdrv().
      
      This is because removing the BDSs can trigger an aio_poll() and wake
      up other jobs that might attempt to use the block job list. If that
      happens the job we're currently destroying should not be in that list
      anymore.
      
      Signed-off-by: Alberto Garcia <berto@igalia.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      0a3e155f
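The ordering concern in the blockjob commit above can be sketched in isolation. This is a minimal illustration, not QEMU's code: the `Job` type, `job_list`, `job_link()`, `job_unlink()`, `walk_jobs()`, `release_resources()` and `destroy_job()` are all hypothetical names. `walk_jobs()` stands in for code that runs re-entrantly during teardown and iterates over the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job type; not QEMU's BlockJob. */
typedef struct Job {
    struct Job *next;
    int being_destroyed;
} Job;

static Job *job_list;      /* head of a singly linked job list */
static int saw_dying_job;  /* set if a walker meets a half-destroyed job */

static void job_link(Job *job)
{
    job->next = job_list;
    job_list = job;
}

static void job_unlink(Job *job)
{
    Job **p;

    for (p = &job_list; *p; p = &(*p)->next) {
        if (*p == job) {
            *p = job->next;
            job->next = NULL;
            return;
        }
    }
}

/* Stand-in for work that may run re-entrantly during teardown. */
static void walk_jobs(void)
{
    Job *j;

    for (j = job_list; j; j = j->next) {
        if (j->being_destroyed) {
            saw_dying_job = 1;
        }
    }
}

/* Stand-in for releasing resources; it may trigger walk_jobs(). */
static void release_resources(Job *job)
{
    job->being_destroyed = 1;
    walk_jobs();
}

/* Unlink first, so re-entrant walkers never see the dying job. */
static void destroy_job(Job *job)
{
    job_unlink(job);
    release_resources(job);
}
```

Swapping the two calls in `destroy_job()` would let `walk_jobs()` observe a job that is already being torn down, which is the situation the commit avoids.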
    • Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-11-28' into staging · 844496f3
      Peter Maydell authored
      
      nbd patches for 2017-11-28
      
      Eric Blake - 0/2 fix two NBD server CVEs
      
      # gpg: Signature made Tue 28 Nov 2017 12:58:29 GMT
      # gpg:                using RSA key 0xA7A16B4A2527436A
      # gpg: Good signature from "Eric Blake <eblake@redhat.com>"
      # gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
      # gpg:                 aka "[jpeg image of size 6874]"
      # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A
      
      * remotes/ericb/tags/pull-nbd-2017-11-28:
        nbd/server: CVE-2017-15118 Stack smash on large export name
        nbd/server: CVE-2017-15119 Reject options larger than 32M
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      844496f3
    • nbd/server: CVE-2017-15118 Stack smash on large export name · 51ae4f84
      Eric Blake authored
      Introduced in commit f37708f6 (2.10).  The NBD spec says a client
      can request export names up to 4096 bytes in length, even though
      they should not expect success on names longer than 256.  However,
      qemu hard-codes the limit of 256, and fails to filter out a client
      that probes for a longer name; the result is a stack smash that can
      potentially give an attacker arbitrary control over the qemu
      process.
      
      The smash can be easily demonstrated with this client:
      $ qemu-io -f raw nbd://localhost:10809/$(printf %3000d 1 | tr ' ' a)
      
      If the qemu NBD server binary (whether the standalone qemu-nbd, or
      the builtin server of QMP nbd-server-start) was compiled with
      -fstack-protector-strong, the ability to exploit the stack smash
      into arbitrary execution is a lot more difficult (but still
      theoretically possible to a determined attacker, perhaps in
      combination with other CVEs).  Still, crashing a running qemu (and
      losing the VM) is bad enough, even if the attacker did not obtain
      full execution control.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      51ae4f84
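The class of fix described for CVE-2017-15118 is a length check before copying a client-supplied name into a fixed-size buffer. The following is a minimal sketch under that assumption, not the actual QEMU patch: `MAX_NAME_SIZE` and `copy_export_name()` are hypothetical names, and the limit of 256 is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_NAME_SIZE 256   /* server-side limit mentioned in the text */

/*
 * Copy a client-supplied name of `len` bytes into `dst`, which must hold
 * MAX_NAME_SIZE + 1 bytes. Returns 0 on success, -1 if the name is too long.
 */
static int copy_export_name(char *dst, const char *src, uint32_t len)
{
    if (len > MAX_NAME_SIZE) {
        return -1;          /* reject before touching the buffer */
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

A 3000-byte name, like the one the demonstration client sends, is rejected here instead of being written past the end of the buffer.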
    • nbd/server: CVE-2017-15119 Reject options larger than 32M · fdad35ef
      Eric Blake authored
      
      The NBD spec gives us permission to abruptly disconnect on clients
      that send outrageously large option requests, rather than having
      to spend the time reading to the end of the option.  No real
      option request requires that much data anyways; and meanwhile, we
      already have the practice of abruptly dropping the connection on
      any client that sends NBD_CMD_WRITE with a payload larger than 32M.
      
      For comparison, nbdkit drops the connection on any request with
      more than 4096 bytes; however, that limit is probably too low
      (as the NBD spec states an export name can theoretically be up
      to 4096 bytes, which means a valid NBD_OPT_INFO could be even
      longer) - even if qemu doesn't permit exports longer than 256
      bytes.
      
      It could be argued that a malicious client trying to get us to
      read nearly 4G of data on a bad request is a form of denial of
      service.  In particular, if the server requires TLS, but a client
      that does not know the TLS credentials sends any option (other
      than NBD_OPT_STARTTLS or NBD_OPT_EXPORT_NAME) with a stated
      payload of nearly 4G, then the server was keeping the connection
      alive trying to read all the payload, tying up resources that it
      would rather be spending on a client that can get past the TLS
      handshake.  Hence, this warranted a CVE.
      
      Present since at least 2.5 when handling known options, and made
      worse in 2.6 when fixing support for NBD_FLAG_C_FIXED_NEWSTYLE
      to handle unknown options.
      
      CC: qemu-stable@nongnu.org
      Signed-off-by: Eric Blake <eblake@redhat.com>
      fdad35ef
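The policy described for CVE-2017-15119 amounts to checking the stated option length before reading any payload. A minimal sketch, assuming the limit is applied as "strictly larger than 32M means disconnect"; `MAX_OPTION_PAYLOAD`, `OptionVerdict` and `check_option_length()` are hypothetical names, not QEMU's.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_OPTION_PAYLOAD (32u * 1024 * 1024)  /* 32M, as in the text */

typedef enum { OPTION_ACCEPT, OPTION_DISCONNECT } OptionVerdict;

/* Decide from the header alone, before any payload byte is read. */
static OptionVerdict check_option_length(uint32_t length)
{
    return length > MAX_OPTION_PAYLOAD ? OPTION_DISCONNECT : OPTION_ACCEPT;
}
```

Because the decision uses only the length field, a client claiming a payload of nearly 4G is dropped without the server spending time reading it.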
    • Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2017-11-28-1' into staging · a914f04c
      Peter Maydell authored
      
      Merge qio 2017/11/28 v1
      
      # gpg: Signature made Tue 28 Nov 2017 10:49:08 GMT
      # gpg:                using RSA key 0xBE86EBB415104FDF
      # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
      # gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
      # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF
      
      * remotes/berrange/tags/pull-qio-2017-11-28-1:
        sockets: avoid crash when cleaning up sockets for an invalid FD
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      a914f04c
    • sockets: avoid crash when cleaning up sockets for an invalid FD · 2d7ad7c0
      Daniel P. Berrangé authored
      
      If socket_listen_cleanup is passed an invalid FD, then querying the socket
      local address will fail. We must thus be prepared for the returned addr to
      be NULL.
      
      Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
      2d7ad7c0
    • Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging · c7e1f823
      Peter Maydell authored
      
      # gpg: Signature made Tue 28 Nov 2017 03:58:11 GMT
      # gpg:                using RSA key 0xEF04965B398D6211
      # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
      # gpg: WARNING: This key is not certified with sufficiently trusted signatures!
      # gpg:          It is not certain that the signature belongs to the owner.
      # Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211
      
      * remotes/jasowang/tags/net-pull-request:
        virtio-net: don't touch virtqueue if vm is stopped
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      c7e1f823
    • virtio-net: don't touch virtqueue if vm is stopped · 70e53e6e
      Jason Wang authored
      
      Guest state should not be touched if the VM is stopped. Unfortunately, we
      did not check the running state and tried to drain the tx queue
      unconditionally in virtio_net_set_status(). A crash was then noticed on a
      migration destination when the user types quit after the virtqueue state
      is loaded but before the region cache is initialized. In this case,
      virtio_net_drop_tx_queue_data() tries to access the uninitialized
      region cache.
      
      Fix this by only dropping tx queue data when vm is running.
      
      Fixes: 283e2c2a ("net: virtio-net discards TX data after link down")
      Cc: Yuri Benditovich <yuri.benditovich@daynix.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: qemu-stable@nongnu.org
      Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      70e53e6e
  3. Nov 27, 2017
    • QAPI & interop: Clarify events emitted by 'block-job-cancel' · c117bb14
      Kashyap Chamarthy authored
      
      When you cancel an in-progress 'mirror' job (or "active `block-commit`")
      with QMP `block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.
      However, when `block-job-cancel` is issued *after* `drive-mirror` has
      indicated (via the event BLOCK_JOB_READY) that the source and
      destination have reached synchronization:
      
          [...] # Snip `drive-mirror` invocation & outputs
          {
            "execute":"block-job-cancel",
            "arguments":{
              "device":"virtio0"
            }
          }
      
          {"return": {}}
      
      It (`block-job-cancel`) will counterintuitively emit the event
      'BLOCK_JOB_COMPLETED':
      
          {
            "timestamp":{
              "seconds":1510678024,
              "microseconds":526240
            },
            "event":"BLOCK_JOB_COMPLETED",
            "data":{
              "device":"virtio0",
              "len":41126400,
              "offset":41126400,
              "speed":0,
              "type":"mirror"
            }
          }
      
      But this is expected behaviour, where the _COMPLETED event indicates
      that synchronization has successfully ended (and the destination now has
      a point-in-time copy, which is at the time of cancel).
      
      So add a small note to this effect in 'block-core.json'.  While at it,
      also update the "Live disk synchronization -- drive-mirror and
      blockdev-mirror" section in 'live-block-operations.rst'.
      
      (Thanks: Max Reitz for reminding me of this caveat on IRC.)
      
      Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      c117bb14
    • Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.11-20171127' into staging · 5e19aed5
      Peter Maydell authored
      
      ppc patch queue 2017-11-27
      
      This series contains a couple of migration fixes for hash guests on
      POWER9 radix MMU hosts.
      
      # gpg: Signature made Mon 27 Nov 2017 04:27:15 GMT
      # gpg:                using RSA key 0x6C38CACA20D9B392
      # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
      # gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
      # gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
      # gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
      # Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392
      
      * remotes/dgibson/tags/ppc-for-2.11-20171127:
        target/ppc: Fix setting of cpu->compat_pvr on incoming migration
        target/ppc: Move setting of patb_entry on hash table init
      
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      5e19aed5
    • qemu-options: Mention locking option of file driver · 1878eaff
      Fam Zheng authored
      
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      1878eaff
    • docs: Add image locking subsection · b1d1cb27
      Fam Zheng authored
      
      This documents the image locking feature and explains when and how
      related options can be used.
      
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      b1d1cb27
    • iotests: fix 075 and 078 · 45f1882a
      John Snow authored
      
      Both of these tests are for formats which now stipulate that they are
      read-only. Adjust the tests to match.
      
      Signed-off-by: John Snow <jsnow@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Reviewed-by: Lukáš Doktor <ldoktor@redhat.com>
      Signed-off-by: Kevin Wolf <kwolf@redhat.com>
      45f1882a
    • target/ppc: Fix setting of cpu->compat_pvr on incoming migration · e07cc192
      Suraj Jitindar Singh authored
      
      cpu->compat_pvr is used to store the current compat mode of the cpu.
      
      On the receiving side during incoming migration we check compatibility
      with the compat mode by calling ppc_set_compat(). However we fail to set
      the compat mode with the hypervisor since the "new" compat mode doesn't
      differ from the current (due to a "cpu->compat_pvr != compat_pvr" check).
      This means that kvm runs the vcpus without a compat mode, which is
      incorrect: a compatibility mode will never be in effect after
      migration.
      
      To fix this so that the compat mode is correctly set with the
      hypervisor, store the desired compat mode and reset cpu->compat_pvr to
      zero before calling ppc_set_compat().
      
      Fixes: 5dfaa532 ("ppc: fix ppc_set_compat() with KVM PR")
      
      Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      e07cc192
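The compat_pvr problem above is an instance of a "skip if unchanged" setter being fed a value that was already loaded into the cache. A minimal sketch of the pattern and of the fix as described, using a hypothetical `Cpu` model; `set_compat()`, `post_load_fixed()` and the `hypervisor_pvr` field are illustrative names, not QEMU's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU model; not QEMU's PowerPCCPU. */
typedef struct {
    uint32_t compat_pvr;       /* cached compat mode */
    uint32_t hypervisor_pvr;   /* what the hypervisor was actually told */
} Cpu;

/* Setter that skips the hypervisor call when the cached value matches. */
static void set_compat(Cpu *cpu, uint32_t compat_pvr)
{
    if (cpu->compat_pvr != compat_pvr) {
        cpu->hypervisor_pvr = compat_pvr;   /* stand-in for the KVM call */
        cpu->compat_pvr = compat_pvr;
    }
}

/* Post-load step: the cached field already holds the migrated value. */
static void post_load_fixed(Cpu *cpu)
{
    uint32_t wanted = cpu->compat_pvr;

    cpu->compat_pvr = 0;        /* force set_compat() to see a change */
    set_compat(cpu, wanted);
}
```

Calling `set_compat(cpu, cpu->compat_pvr)` directly after load is a no-op, so the hypervisor is never informed; stashing the value and zeroing the cache first makes the call take effect.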
    • target/ppc: Move setting of patb_entry on hash table init · ee4d9ecc
      Suraj Jitindar Singh authored
      
      The patb_entry is used to store the location of the process table in
      guest memory. The msb is also used to indicate the mmu mode of the
      guest, that is patb_entry & 1 << 63 ? radix_mode : hash_mode.
      
      Currently we set this to zero in spapr_setup_hpt_and_vrma() since if
      this function gets called then we know we're hash. However, some code
      paths, such as setting up the hpt on incoming migration of a hash guest,
      call spapr_reallocate_hpt() directly, bypassing this higher level
      function. Since we assume radix if the host is capable, this results in
      the msb in patb_entry being left set, so in spapr_post_load() we call
      kvmppc_configure_v3_mmu() and tell the host we're radix, which as
      expected means addresses cannot be translated once we actually run the cpu.
      
      To fix this move the zeroing of patb_entry into spapr_reallocate_hpt().
      
      Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      ee4d9ecc
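The msb test on patb_entry and the effect of moving the zeroing can be sketched as follows. This is an illustration only: `Machine`, `PATB_RADIX_BIT`, `patb_is_radix()` and this `reallocate_hpt()` are hypothetical stand-ins. The sketch uses an explicit 64-bit constant, since a plain `1 << 63` does not fit in a 32-bit `int` in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PATB_RADIX_BIT (UINT64_C(1) << 63)  /* msb set means radix */

/* Hypothetical machine state holding only the field of interest. */
typedef struct {
    uint64_t patb_entry;
} Machine;

static bool patb_is_radix(uint64_t patb_entry)
{
    return (patb_entry & PATB_RADIX_BIT) != 0;
}

/* Stand-in for the low-level HPT allocation path. */
static void reallocate_hpt(Machine *m)
{
    /* ... allocating the hash page table is omitted here ... */
    m->patb_entry = 0;   /* a hash guest must not keep the radix bit */
}
```

Placing the zeroing in the low-level function means every caller gets it, including callers that bypass the higher-level setup function.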
  4. Nov 24, 2017
  5. Nov 23, 2017
  6. Nov 22, 2017
    • migration/ram.c: do not set 'postcopy_running' in POSTCOPY_INCOMING_END · acab30b8
      Daniel Henrique Barboza authored
      
      When migrating a VM with 'migrate_set_capability postcopy-ram on'
      a postcopy_state is set during the process, ending up with the
      state POSTCOPY_INCOMING_END when the migration is over. This
      postcopy_state is taken into account inside ram_load to check
      how it will load the memory pages. This same ram_load is also called
      by the loadvm command.
      
      Inside ram_load, the logic to see if we're at postcopy_running state
      is:
      
      postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING
      
      postcopy_state_get() returns this enum type:
      
      typedef enum {
          POSTCOPY_INCOMING_NONE = 0,
          POSTCOPY_INCOMING_ADVISE,
          POSTCOPY_INCOMING_DISCARD,
          POSTCOPY_INCOMING_LISTENING,
          POSTCOPY_INCOMING_RUNNING,
          POSTCOPY_INCOMING_END
      } PostcopyState;
      
      In the case where ram_load is executed and postcopy_state is
      POSTCOPY_INCOMING_END, postcopy_running will be set to 'true' and
      ram_load will behave like a postcopy is in progress. This scenario isn't
      achievable in a migration but it is reproducible when executing
      savevm/loadvm after migrating with 'postcopy-ram on', causing loadvm
      to fail with Error -22:
      
      Source:
      
      (qemu) migrate_set_capability postcopy-ram on
      (qemu) migrate tcp:127.0.0.1:4444
      
      Dest:
      
      (qemu) migrate_set_capability postcopy-ram on
      (qemu)
      ubuntu1704-intel login:
      Ubuntu 17.04 ubuntu1704-intel ttyS0
      
      ubuntu1704-intel login: (qemu)
      (qemu) savevm test1
      (qemu) loadvm test1
      Unknown combination of migration flags: 0x4 (postcopy mode)
      error while loading state for instance 0x0 of device 'ram'
      Error -22 while loading VM state
      (qemu)
      
      This patch fixes this problem by changing the existing logic for
      postcopy_advised and postcopy_running in ram_load, making them
      'false' if we're at POSTCOPY_INCOMING_END state.
      
      Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
      CC: Juan Quintela <quintela@redhat.com>
      CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      acab30b8
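The change in predicate described above can be sketched with the enum from the commit message. The helper names `postcopy_running_old()`, `postcopy_running_new()` and `postcopy_advised_new()` are hypothetical, and the exact form of the new conditions is an assumption based on the description ("false if we're at POSTCOPY_INCOMING_END"), not a copy of the patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    POSTCOPY_INCOMING_NONE = 0,
    POSTCOPY_INCOMING_ADVISE,
    POSTCOPY_INCOMING_DISCARD,
    POSTCOPY_INCOMING_LISTENING,
    POSTCOPY_INCOMING_RUNNING,
    POSTCOPY_INCOMING_END
} PostcopyState;

/* Old logic: END compares >= LISTENING, so it counts as running. */
static bool postcopy_running_old(PostcopyState ps)
{
    return ps >= POSTCOPY_INCOMING_LISTENING;
}

/* Assumed new logic: a finished postcopy is no longer running. */
static bool postcopy_running_new(PostcopyState ps)
{
    return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
}

/* Assumed new logic for the advised flag, with the same upper bound. */
static bool postcopy_advised_new(PostcopyState ps)
{
    return ps >= POSTCOPY_INCOMING_ADVISE && ps < POSTCOPY_INCOMING_END;
}
```

With the upper bound in place, a later loadvm that finds the state at POSTCOPY_INCOMING_END takes the ordinary (non-postcopy) load path.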
    • ppc: fix VTB migration · 6dd836f5
      Laurent Vivier authored
      
      Migration of a system under stress (for example, with
      "stress-ng --numa 2") triggers on the destination
      some kernel watchdog messages like:
      
      NMI watchdog: BUG: soft lockup - CPU#0 stuck for 3489660870s!
      NMI watchdog: BUG: soft lockup - CPU#1 stuck for 3489660884s!
      
      This problem appears with the changes introduced by
          42043e4f spapr: clock should count only if vm is running
      
      I think this commit only triggers the problem.
      
      Kernel computes the soft lockup duration using the
      Virtual Timebase register (VTB), not using the Timebase
      Register (TBR, the one 42043e4f stops).
      
      It appears VTB is not migrated, so this patch adds it in
      the list of the SPRs to migrate, and fixes the problem.
      
      For the migration, I've tested a migration from qemu-2.8.0 and
      pseries-2.8.0 to a patched master (qemu-2.11.0-rc1). The received
      VTB is 0 (as it is not initialized by qemu-2.8.0), but the value
      seems to be ignored by KVM and a non-zero VTB is used by the kernel.
      I have no explanation for that, but as the original problem appears
      only with SMP systems under stress I suspect some problems in KVM
      (I think because VTB is shared by all threads of a core).
      
      Signed-off-by: Laurent Vivier <lvivier@redhat.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      6dd836f5
    • spapr: Implement bug in spapr-vty device to be compatible with PowerVM · 6c3bc244
      David Gibson authored
      
      The spapr-vty device implements the PAPR defined virtual console,
      which is also implemented by IBM's proprietary PowerVM hypervisor.
      
      PowerVM's implementation has a bug where it inserts an extra \0 after
      every \r going to the guest.  Because of that Linux's guest side
      driver has a workaround which strips \0 characters that appear
      immediately after a \r.
      
      That means that when running under qemu, sending a binary stream from
      host to guest via spapr-vty which happens to include a \r\0 sequence
      will get corrupted by that workaround.
      
      To deal with that, this patch duplicates PowerVM's bug, inserting an
      extra \0 after each \r.  Ugly, but the best option available.
      
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Thomas Huth <thuth@redhat.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      6c3bc244
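The interaction between the host-side insertion and the guest-side workaround can be sketched with two small byte filters. Both functions are hypothetical illustrations of the behaviour described in the commit message, not code from QEMU or the Linux driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Host side: copy `len` bytes from `in` to `out`, adding a '\0' after
 * every '\r'. `out` must have room for 2 * len bytes.
 * Returns the number of bytes written.
 */
static size_t insert_nul_after_cr(const char *in, size_t len, char *out)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        out[n++] = in[i];
        if (in[i] == '\r') {
            out[n++] = '\0';
        }
    }
    return n;
}

/* Guest side, as described: drop a '\0' that directly follows a '\r'. */
static size_t strip_nul_after_cr(const char *in, size_t len, char *out)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        if (in[i] == '\0' && i > 0 && in[i - 1] == '\r') {
            continue;
        }
        out[n++] = in[i];
    }
    return n;
}
```

Without the insertion, a stream containing `\r\0` loses its `\0` in the guest. With the insertion, the guest strips only the added byte and the original data survives the round trip.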
    • hw/ppc/spapr: Fix virtio-scsi bootindex handling for LUNs >= 256 · bac658d1
      Thomas Huth authored
      LUNs >= 256 have to be encoded with the so-called "flat space
      addressing method" for virtio-scsi, where an additional bit has to
      be set. SLOF already took care of this with the following commit:
      
       https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=f72a37713fea47da
       (see https://bugzilla.redhat.com/show_bug.cgi?id=1431584 for details)
      
      But QEMU does not use this encoding yet for device tree paths
      that have to be handed over to SLOF to deal with the "bootindex"
      property, so SLOF currently fails to boot from virtio-scsi devices
      with LUNs >= 256 in the right boot order. Fix it by using the bit
      to indicate the "flat space addressing method" for LUNs >= 256.
      
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      bac658d1
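The encoding rule described above can be sketched as a single helper. The commit message does not state which bit is used; this sketch assumes the SCSI flat space addressing method, where the address method field 01b occupies the top two bits of the 16-bit LUN value, giving a mask of 0x4000. `LUN_FLAT_SPACE_BIT` and `encode_lun()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: address method 01b in the top two bits of a 16-bit LUN. */
#define LUN_FLAT_SPACE_BIT 0x4000u

/* LUNs below 256 are passed through; larger ones get the flat-space bit. */
static uint16_t encode_lun(uint16_t lun)
{
    if (lun >= 256) {
        lun |= LUN_FLAT_SPACE_BIT;
    }
    return lun;
}
```

For example, under this assumption LUN 256 encodes as 0x4100, while LUN 255 stays 0x00ff.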
  7. Nov 21, 2017