Issue2102

Title: hg export and hg import speak different languages
Priority: bug
Status: resolved
Superseder:
Nosy List: abuehl, brendan, durin42, mg, mpm, pmezard, rsc, tonfa
Assigned To:
Topics:

Created on 2010-03-18.07:09:51 by rsc, last changed 2012-02-01.00:19:40 by mpm.

Files
File name Uploaded Type
unnamed durin42, 2010-03-18.14:23:19 text/html
unnamed durin42, 2010-03-18.14:57:24 text/html
Messages
msg12108 (view) Author: brendan Date: 2010-03-20.21:00:10
See http://hg.intevation.org/mercurial/crew/rev/fb06e357e698
(patch: more precise NoHunk, raised for every file (issue2102))
msg12088 (view) Author: tonfa Date: 2010-03-18.18:58:15
The following two changesets from stable should improve it:

changeset:   10729:7a5931c5f2dc
branch:      stable
user:        Benoit Boissinot <benoit.boissinot@ens-lyon.org>
date:        Thu Mar 18 18:22:34 2010 +0100
summary:     patch: enhance diff detection regexp, allow '--- ' in patch message

changeset:   10730:4d6bd7b8b6d8
branch:      stable
tag:         tip
user:        Benoit Boissinot <benoit.boissinot@ens-lyon.org>
date:        Thu Mar 18 19:26:56 2010 +0100
summary:     mq: allow lines starting with '--- ' in patch messages
msg12087 (view) Author: tonfa Date: 2010-03-18.15:48:09
Augie: with the fix to 403, and if I remove the '---' from the commit 
message in 2006, then after qpush -a the working dirs are identical (checked 
with md5sum and by comparing the output of hg manifest -v).

In any case, we should *never* use the external patcher for git patches.
msg12086 (view) Author: durin42 Date: 2010-03-18.14:57:24
Fair enough, but Russ and I have both observed incorrect patch application
that is definitely showing bugs in hunk detection.

msg12085 (view) Author: tonfa Date: 2010-03-18.14:31:22
# Date 1215132065 25200
-	a,b := fun()
+a,b := fun()


The diff above is in the commit message.


Continuing my journey with mq, the next failure after 403 is 2006, where the 
patch is detected as empty because of the '---' in the commit message.
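
A minimal sketch of how a '--- ' line in a commit message can confuse diff detection. The two regular expressions, the `split_patch` helper and the sample patch text are all hypothetical illustrations, not Mercurial's actual code, and the stricter rule is only one plausible fix: treat '--- ' as a diff start only when the next line begins with '+++ '.

```python
import re

# Naive rule (illustrative): any line starting with '--- ' begins the diff.
NAIVE_START = re.compile(r"^--- ", re.MULTILINE)

# Stricter rule (illustrative): '--- ' only counts when the very next
# line starts with '+++ ', as in a unified diff file header.
STRICT_START = re.compile(r"^--- [^\n]*\n\+\+\+ ", re.MULTILINE)


def split_patch(text, start_re):
    """Return (message, diff) by cutting text at the first diff-start match."""
    match = start_re.search(text)
    if match is None:
        return text, ""
    return text[:match.start()], text[match.start():]


# Hypothetical patch whose commit message contains a '--- ' line.
SAMPLE = (
    "fix formatting\n"
    "\n"
    "--- old behaviour, described in the message\n"
    "more message text\n"
    "\n"
    "--- a/f.go\n"
    "+++ b/f.go\n"
    "@@ -1,1 +1,1 @@\n"
    "-\ta,b := fun()\n"
    "+a,b := fun()\n"
)

naive_msg, naive_diff = split_patch(SAMPLE, NAIVE_START)
strict_msg, strict_diff = split_patch(SAMPLE, STRICT_START)
```

With the naive rule the "diff" starts inside the commit message, so a parser looking for hunks right after it may find none. The stricter rule keeps the whole message and cuts at the real file header.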
msg12084 (view) Author: durin42 Date: 2010-03-18.14:23:19
I'm not worried about hash changes. These are actual errors in patch application that produce different content.

msg12083 (view) Author: tonfa Date: 2010-03-18.14:16:26
Augie: Yeah, we do strip the commit message, while converters/mq don't, hence 
the difference.
msg12082 (view) Author: tonfa Date: 2010-03-18.14:13:12
With:

diff --git a/mercurial/patch.py b/mercurial/patch.py
--- a/mercurial/patch.py
+++ b/mercurial/patch.py
@@ -1025,7 +1028,6 @@
                 current_hunk.fix_newline()
             yield 'hunk', current_hunk
             current_hunk = None
-            gitworkdone = False
         if ((sourcefile or state == BFILE) and ((not context and x[0] == '@') or
             ((context is not False) and x.startswith('***************')))):
             try:


403 passes correctly.

The problem is that NoHunk was raised without reason, so the external patch program was used (and external patch is buggy 
with git diffs).
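
One way to picture why a patch like 403.diff should not be reported as having no hunks: a 'diff --git' section can carry only git metadata (for example a mode change) and still be real work. The sketch below is a hypothetical summarizer, not the logic in mercurial/patch.py; the names `summarize_git_patch` and `has_work` are made up for illustration, and the path parsing assumes simple paths without ' b/' inside them.

```python
GIT_METADATA_PREFIXES = (
    "old mode ", "new mode ", "new file mode ", "deleted file mode ",
    "rename from ", "rename to ", "copy from ", "copy to ",
)


def summarize_git_patch(text):
    """Return one dict per 'diff --git' section: path, hunk count, metadata flag."""
    entries = []
    current = None
    for line in text.splitlines():
        if line.startswith("diff --git "):
            # 'diff --git a/PATH b/PATH': take the b/ side as the path.
            path = line.split(" b/", 1)[1]
            current = {"path": path, "hunks": 0, "metadata": False}
            entries.append(current)
        elif current is None:
            continue  # still in the commit message / header
        elif line.startswith("@@ "):
            current["hunks"] += 1
        elif line.startswith(GIT_METADATA_PREFIXES):
            current["metadata"] = True
    return entries


def has_work(entry):
    """A section counts as applied work if it has hunks or git metadata."""
    return entry["hunks"] > 0 or entry["metadata"]


# Shortened version of the 403.diff content quoted in this issue.
PATCH = "\n".join([
    "help management of empty pkg and lib directories in perforce",
    "",
    "diff --git a/lib/place-holder b/lib/place-holder",
    "new file mode 100644",
    "--- /dev/null",
    "+++ b/lib/place-holder",
    "@@ -0,0 +1,2 @@",
    "+perforce does not maintain empty directories.",
    "+this file helps.",
    "diff --git a/src/cmd/gc/mksys.bash b/src/cmd/gc/mksys.bash",
    "old mode 100644",
    "new mode 100755",
])

entries = summarize_git_patch(PATCH)
```

Here the mksys.bash section has zero hunks but `has_work` is still true for it, which is the case where raising a "no hunk" error would be unjustified.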
msg12081 (view) Author: durin42 Date: 2010-03-18.14:01:08
I've got a reproduction script that verifies the diffs apply correctly. I've 
seen problems in diff application earlier than r403 in go; r403 is just the 
first diff that fails to apply, because an earlier diff was improperly imported.

# assumes go is a sibling of the script
% cat apply-go-patches.sh
hg init go-patchy || exit 1
cd go-patchy
for p in $(hg -R ../go log --rev 0:tip --template '{rev}\n') ; do
    echo "revision $p"
    hg -R ../go export -r $p > orig
    hg import orig || break
    hg export tip | egrep -v '^# (Node ID|Parent)' > new
    cat orig | egrep -v '^# (Node ID|Parent)' > orig-filtered
    diff -u orig-filtered new || break
done

revision 286
applying orig
--- orig-filtered	2010-03-18 08:56:32.000000000 -0500
+++ new	2010-03-18 08:56:32.000000000 -0500
@@ -1,7 +1,7 @@
 # HG changeset patch
 # User Ken Thompson <ken@golang.org>
 # Date 1215132065 25200
-	a,b := fun()
+a,b := fun()
 
 SVN=125998


It's worth noting that when I first started taking a look at this a couple 
of days ago, I was seeing revision 5 of the go repository not import 
reliably - patch.iterhunks() wasn't always yielding all of the hunks of the 
patch, but I couldn't reproduce that failure just now.
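
The comparison step of the script above can be sketched in Python as follows. This is an illustrative equivalent of the `egrep -v` and `diff -u` lines only; the function names are hypothetical, the Node ID values in the sample are placeholders, and running `hg export` / `hg import` themselves is left out.

```python
import difflib
import re

# Header lines that legitimately differ after a re-import.
VOLATILE_HEADER = re.compile(r"^# (Node ID|Parent)")


def filter_headers(export_text):
    """Drop '# Node ID' and '# Parent' lines, like the egrep -v in the script."""
    return [line for line in export_text.splitlines()
            if not VOLATILE_HEADER.match(line)]


def roundtrip_diff(orig_export, new_export):
    """Return unified-diff lines between the filtered exports (empty if equal)."""
    return list(difflib.unified_diff(
        filter_headers(orig_export), filter_headers(new_export),
        fromfile="orig-filtered", tofile="new", lineterm=""))


# Sample exports modelled on revision 286; the Node IDs are placeholders.
ORIG = (
    "# HG changeset patch\n"
    "# User Ken Thompson <ken@golang.org>\n"
    "# Date 1215132065 25200\n"
    "# Node ID " + "0" * 40 + "\n"
    "\ta,b := fun()\n"
    "\n"
    "SVN=125998\n"
)
NEW = ORIG.replace("0" * 40, "1" * 40).replace("\ta,b", "a,b")

diff_lines = roundtrip_diff(ORIG, NEW)
```

An empty result means the export survived the round trip apart from the node and parent hashes; here the diff flags the lost leading tab in the commit message.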
msg12079 (view) Author: tonfa Date: 2010-03-18.13:00:15
Thanks for the report Russ.

To reproduce simply (using mq):
hg clone https://go.googlecode.com/hg go1
cd go1
hg qimport -r 0:tip
hg qpop -a
hg qpush -a

This fails when applying 403.diff, which looks like:
# HG changeset patch
# User Rob Pike <r@golang.org>
# Date 1216685449 25200
# Node ID 03aa2b206f499ad6eb50e6e207b9e710d6409c98
# Parent  93d10138ad8df586827ca90b4ddb5033e21a3a84
help management of empty pkg and lib directories in perforce

R=gri
DELTA=4  (4 added, 0 deleted, 0 changed)
OCL=13328
CL=13328

diff --git a/lib/place-holder b/lib/place-holder
new file mode 100644
--- /dev/null
+++ b/lib/place-holder
@@ -0,0 +1,2 @@
+perforce does not maintain empty directories.
+this file helps.
diff --git a/pkg/place-holder b/pkg/place-holder
new file mode 100644
--- /dev/null
+++ b/pkg/place-holder
@@ -0,0 +1,2 @@
+perforce does not maintain empty directories.
+this file helps.
diff --git a/src/cmd/gc/mksys.bash b/src/cmd/gc/mksys.bash
old mode 100644
new mode 100755
msg12074 (view) Author: rsc Date: 2010-03-18.09:13:57
I don't really need to generate a hash-identical repository.
The export --git output should contain enough information to at least create
a plausibly similar repository, at least for a straight line repository
like https://go.googlecode.com/hg.  And yet it can't.  This implies
that for some individual patch (#404, it turns out, below),
hg export --git of a single change can't be pulled in with hg import.

In fact, I know that the export --git output does have enough useful
data to take the patch sequence and recreate a repository similar to
the original.  I did just that when I was working on creating the
initial copy of that repository, but I had to use an external program
that I wrote to process the export --git output, because import wasn't
up to the job.
msg12073 (view) Author: mpm Date: 2010-03-18.08:36:05
That's never going to work for the hg repository. There's not enough info in
the exported patches to precisely reconstruct the repository with the same
hashes, even with --git. This is due to minor historical details in repo
encoding like which newlines are stripped, how dates are encoded, etc.

Export/import can round-trip most individual patches just fine and with
--exact, will properly recreate merges, etc., provided the target contains
the precise parent hashes needed.

But it won't ever be 100% for hash-perfect round-tripping every changeset as
there are too many details and things that even --git patches can't
represent. If you need perfect fidelity, you need to use bundle/unbundle and
push/pull.
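
A minimal sketch of why small encoding details matter for hashes, assuming a node ID is a SHA-1 over the two sorted parent IDs followed by the entry text. The `entry` layout below is a hypothetical simplification (a real changelog entry holds more fields), so the resulting digests are not real Mercurial hashes.

```python
import hashlib

NULL_ID = b"\0" * 20  # parent ID used when a parent is absent


def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """SHA-1 over the two parent IDs (sorted) followed by the entry text."""
    first, second = sorted([p1, p2])
    digest = hashlib.sha1(first)
    digest.update(second)
    digest.update(text)
    return digest.hexdigest()


def entry(message):
    """Hypothetical, simplified entry: user, date, blank line, message."""
    header = b"Ken Thompson <ken@golang.org>\n1215132065 25200\n\n"
    return header + message


# Same change, but the second message has lost its leading tab.
original = node_id(entry(b"\ta,b := fun()\n\nSVN=125998"))
stripped = node_id(entry(b"a,b := fun()\n\nSVN=125998"))
```

The two IDs differ although only one whitespace character changed, and because each node ID feeds into its children's hashes, every descendant would differ as well.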
msg12072 (view) Author: rsc Date: 2010-03-18.07:09:51
It would be nice if hg export's output could be read with hg import reliably.

$ hg clone http://selenic.com/repo/hg
destination directory: hg
requesting all changes
adding changesets
adding manifests
adding file changes
added 10725 changesets with 21415 changes to 1481 files
updating working directory
1146 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ mkdir /tmp/hgx
$ hg -R /tmp/hg export -o /tmp/hgx/patch.%n  0:
$ mkdir /tmp/hg1
$ cd /tmp/hg1
$ hg init .
$ hg import /tmp/hgx/patch.*
applying /tmp/hgx/patch.00001
applying /tmp/hgx/patch.00002
...
applying /tmp/hgx/patch.00074
applying /tmp/hgx/patch.00075
[editor popped up to ask for CL description!]
applying /tmp/hgx/patch.00076
applying /tmp/hgx/patch.00077
...
applying /tmp/hgx/patch.00102
applying /tmp/hgx/patch.00103
patching file hgweb.py
Hunk #1 FAILED at 23
Hunk #2 FAILED at 221
2 out of 2 hunks FAILED -- saving rejects to file hgweb.py.rej
abort: patch failed to apply
$  

Adding --git to the export doesn't help.  In this example the
import behaves exactly the same, but sometimes it behaves
differently.

I picked http://selenic.com/repo/hg because it is an example
I thought would be close to home.  The actual repo I was originally
trying this with was http://go.googlecode.com/hg/.

$ hg clone https://go.googlecode.com/hg go1
requesting all changes
adding changesets
adding manifests
adding file changes
added 5071 changesets with 21073 changes to 3408 files
updating working directory
2047 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ mkdir gox
$ hg -R go1 export -o gox/patch.%n 0:
$ hg init go2
$ hg -R go2 import gox/patch.*
applying gox/patch.0001
applying gox/patch.0002
...
applying gox/patch.0045
applying gox/patch.0046
abort: no diffs found
$ 

Okay, let's try with git mode, since patch.0046 corresponds
to a pure permission bit change, with no actual diffs.
It gets past #0046, but now it doesn't like #0404.

$ hg -R go1 export --git -o gox/gpatch.%n 0:
$ hg init go3
$ hg -R go3 import gox/gpatch.*
applying gox/gpatch.0001
applying gox/gpatch.0002
...
applying gox/gpatch.0403
applying gox/gpatch.0404
1 out of 1 hunk ignored -- saving rejects to file lib/place-holder.rej
abort: patch command failed: exited with status 1
$ 

This is all with Mercurial 1.3.1, though it breaks in enough
different ways that it seems likely at least one of them is still
broken in the latest Mercurial.  This would probably be a good
command sequence to include in a torture test to run before a
release.
History
Date                 User     Action  Args
2012-02-01 00:19:40  mpm      set     status: chatting -> resolved
                                      nosy: mpm, tonfa, brendan, pmezard, mg, abuehl, durin42, rsc
2010-03-20 21:00:10  brendan  set     nosy: mpm, tonfa, brendan, pmezard, mg, abuehl, durin42, rsc
                                      messages: + msg12108
2010-03-18 23:47:05  brendan  set     nosy: + brendan
2010-03-18 18:58:15  tonfa    set     nosy: mpm, tonfa, pmezard, mg, abuehl, durin42, rsc
                                      messages: + msg12088
2010-03-18 15:48:09  tonfa    set     nosy: mpm, tonfa, pmezard, mg, abuehl, durin42, rsc
                                      messages: + msg12087
2010-03-18 14:57:24  durin42  set     files: + unnamed
                                      nosy: mpm, tonfa, pmezard, mg, abuehl, durin42, rsc
                                      messages: + msg12086
2010-03-18 14:31:22  tonfa    set     nosy: mpm, tonfa, pmezard, mg, abuehl, durin42, rsc
                                      messages: + msg12085
2010-03-18 14:23:19  durin42  set     files: + unnamed
                                      nosy: mpm, tonfa, pmezard, mg, abuehl, durin42, rsc
                                      messages: + msg12084
2010-03-18 14:16:26  tonfa    set     nosy: mpm, tonfa, pmezard, mg, abuehl, durin42, rsc
                                      messages: + msg12083
2010-03-18 14:13:12  tonfa    set     nosy: mpm, tonfa, pmezard, mg, abuehl, durin42, rsc
                                      messages: + msg12082
2010-03-18 14:01:08  durin42  set     nosy: + durin42
                                      messages: + msg12081
2010-03-18 13:00:15  tonfa    set     nosy: + pmezard, tonfa
                                      messages: + msg12079
2010-03-18 10:08:47  mg       set     nosy: + mg
2010-03-18 09:13:57  rsc      set     messages: + msg12074
2010-03-18 08:36:05  mpm      set     status: unread -> chatting
                                      nosy: + mpm
                                      messages: + msg12073
2010-03-18 08:03:35  abuehl   set     nosy: + abuehl
2010-03-18 07:09:51  rsc      create