Issue1327

Title merge uses wrong ancestor (and consequently does the wrong thing)
Priority bug Status chatting
Superseder Nosy List Ringding, abuehl, astratto, cboos, djc, dsp, friedrich, hstuart, jcoomes, kiilerix, kupfer, mg, mjnelson, morisgi, mpm, parren, sjtai, tonfa
Assigned To mpm Topics merge

Created on 2008-10-07.08:16:20 by Ringding, last changed 2010-02-15.17:57:44 by hstuart.

Files
File name Uploaded Type Edit Remove
playground3.rextile parren, 2009-12-28.12:57:47 text/x-rextile
Messages
msg11728 (view) Author: tonfa Date: 2010-02-15.17:03:20
Sorry about last patch, it had some errors. Here is a better version:

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -377,10 +377,42 @@
 
         r = self._filelog.renamed(self._filenode)
         if r:
-            pl[0] = (r[0], r[1], None)
+            pl[0] = (r[0], r[1], self._repo.file(r[0]))
 
-        return [filectx(self._repo, p, fileid=n, filelog=l)
-                for p, n, l in pl if n != nullid]
+        curr = self.rev()
+        cl = self._repo.changelog
+        mf = self._repo.manifest
+        for i in xrange(len(pl)):
+            path, fnode, fl = pl[i]
+            parentrev = fl.linkrev(fl.rev(fnode))
+            parentnode = cl.node(parentrev)
+            last = None
+            if fnode != nullid and not cl.descendant(parentrev, curr):
+                # duplicate linkrev, explore
+                visit = [curr]
+                seen = set()
+                while visit:
+                    n = visit.pop()
+                    if n <= parentrev or n in seen:
+                        continue
+                    seen.add(n)
+                    cldata = cl.read(parentnode)
+                    if path in cldata[3]:
+                        fn = mf.read(cldata[0]).get(path)
+                        if last is not None and fn != fnode:
+                            continue
+                        if fn == fnode:
+                            last = n
+                    for p in cl.parentrevs(n):
+                        if p != nullrev:
+                            visit.append(p)
+            if last is not None:
+                pl[i] = (path, fnode, fl, last)
+            else:
+                pl[i] = (path, fnode, fl, parentrev)
+
+        return [filectx(self._repo, p, fileid=n, changeid=cln, filelog=l)
+                for p, n, l, cln in pl if n != nullid]
 
     def children(self):
         # hard for renames
msg11727 (view) Author: tonfa Date: 2010-02-15.16:45:40
The following patch, while still a bit hackish, should make filecontext() do 
the right thing™ when calling parents()

At least for most of the cases.

diff --git a/mercurial/context.py b/mercurial/context.py
--- a/mercurial/context.py
+++ b/mercurial/context.py
@@ -377,10 +377,42 @@
 
         r = self._filelog.renamed(self._filenode)
         if r:
-            pl[0] = (r[0], r[1], None)
+            pl[0] = (r[0], r[1], self._repo.file(r[0]))
 
-        return [filectx(self._repo, p, fileid=n, filelog=l)
-                for p, n, l in pl if n != nullid]
+        curr = self.rev()
+        cl = self._repo.changelog
+        mf = self._repo.manifest
+        for i in xrange(len(pl)):
+            path, n, fl = pl[i]
+            parentrev = fl.linkrev(fl.rev(n))
+            parentnode = cl.node(parentrev)
+            last = None
+            if not cl.descendant(parentrev, curr):
+                # duplicate linkrev, explore
+                visit = [self.rev()]
+                seen = set()
+                while visit:
+                    n = visit.pop(0)
+                    if n <= parentrev or n in seen:
+                        continue
+                    seen.add(n)
+                    cldata = cl.read(parentnode)
+                    if path in cldata[3]:
+                        fn = mf.readdelta(cldata[0]).get(path)
+                        if last is not None and fn != n:
+                            continue
+                        if fn == n:
+                            last = n
+                    for p in cl.parentrevs(n):
+                        if p != nullrev:
+                            visit.append(p)
+            if last is not None:
+                pl[i] = (path, n, fl, last)
+            else:
+                pl[i] = (path, n, fl, parentrev)
+
+        return [filectx(self._repo, p, fileid=n, changeid=cln, filelog=l)
+                for p, n, l, cln in pl if n != nullid]
 
     def children(self):
         # hard for renames
msg11313 (view) Author: parren Date: 2009-12-29.12:47:41
My point is just that, given a situation where we need to merge two revs where 
one is a descendant of the other, can we not safely assume that simply 
choosing the descendant is the right thing to do? After all, given revs a and 
b with chosen common ancestor x, we are looking for a version y that 
incorporates both the changes of a vs x and of b vs x. But in this case, if a 
is a descendant of b, then a already contains the changes of b vs x, so 
choosing it seems the proper thing to do.
msg11310 (view) Author: tonfa Date: 2009-12-28.19:10:58
I think criss-cross merges usually don't have a unique correct solution.


Here is a (maybe) better picture of the example you used (since I draw
it, I might as well send it)


0 (f0) -- 1 (f1) -------- 3 (f1) ---------------------- 7 (f3)
    \  \       \        /                             /       
     \  \       `------+---------------- 5 (f3) -----' 
      \  \            /                 /             
       \  `------ 2 (f0)               / 
        \               `-------------+------- 6 (f2)
         \                           /        /
          `-------------------- 4 (f2)-------'
msg11308 (view) Author: parren Date: 2009-12-28.12:57:47
The attached file, playground3.rextile, contains a session that provokes a 
situation where hg tries to merge two file revs, one of which is an ancestor 
of the other, but fails to merge because the chosen changelog ancestor lists 
an even older rev. As per mpm's analysis of my last case, there is an 
equidistant other changelog ancestor which would work, but that is not the 
point here. I think Hg should just pick the descendant when merging two files 
where one is a descendant of the another. But maybe there are equally valid 
scenarios where this would not be the right thing to do. Ideas?
msg10971 (view) Author: kiilerix Date: 2009-11-13.19:28:30
I assume that parrens intermezzo didn't change the reasoning behind
msg10924, that the rename part still needs fixing, that mpm never changes
his mind, and that he didn't mean to switch back to testing.

I'm setting this back to chatting.
msg10970 (view) Author: mpm Date: 2009-11-13.18:49:59
Switching back to testing. Analysis of parren's report here:

http://markmail.org/message/stzw5mi763o77ckz
msg10967 (view) Author: parren Date: 2009-11-13.16:42:17
Specifically, since this change: http://selenic.com/repo/hg/rev/f153af9580fe
msg10966 (view) Author: parren Date: 2009-11-13.16:38:59
I've just posted a test to the mailing list that fails since these fixes. I 
think it should not.
http://markmail.org/thread/kidaxogur2ns75v4
msg10926 (view) Author: mpm Date: 2009-11-08.18:09:28
I've marked issue1561 as superceded by this one and copied nosy.
msg10924 (view) Author: mpm Date: 2009-11-08.18:05:45
I'm degrading this to bug, as this is now reduced to a triply-rare case:
duplicate file change with rename being backed out.

But I'm setting this back to chatting as the rename part still needs fixing.
msg10923 (view) Author: kiilerix Date: 2009-11-08.17:16:29
mpm committed a fix in f8ca4035a949 to appear in 1.4. It addresses the
reported cases in the requested way.

The ancestor of a file f in two changesets is no longer simply found in the
filelog: 

If a file with the name f is found in the changesets ancestor then that is
used - no matter if it is related or not. 

If no such file exists then the old method is used - it works in the filelog
and can follow renames but might still show the problematic behaviour
discussed here.
msg8336 (view) Author: dsp Date: 2009-01-06.01:39:14
I debugged the problem a little bit.
The reason why the merge is done that way, is that there is an ancestor lookup before 
the actually merging the files, based on their filectx/sha1. As file 'a' has the same 
sha1 in 1 and 3 (in the small example from benoit) the ancestor lookup stops with 
revision 1 as the ancestor (even though  the correct one would be 0). The file in rev 
1 has the content ab, which is the same content as in rev 3. As rev 1 is taken as the 
base content, the merge assumes that the content from 3 must be older then the one 
from 2, as it matches the base content, therefore the content of file 'a' in rev 2 is 
taken (which is a).

I guess the right way to fix this, is to go on with the ancestor search until the 
changeset ancestor is reached and take the 'latest' filectx ancestor.
msg8151 (view) Author: mjnelson Date: 2008-12-10.18:36:16
@tonfa: @ringding: per msgs 7334 and 7324: these describe what we're seeing. 
Identical changes on two different branches are causing the wrong ancestor to be
used for a merge.  The correct ancestor is identified by debugancestor, but
merge --debug shows a different selection.

It's not about trying to correctly interpret the intent of such changes, it's
about providing valid diffs for comparison.

In our case, we have a graphical log that (much crud elided) looks like this:

@  changeset:   8037:89d3a91734bc
|
o    changeset:   8036:01dbf681df64
|\
| o    changeset:   8035:9cd60bb1f301
| |\
| | o    changeset:   8034:630af5a7e4e4
| | |\
| | | o    changeset:   8033:11a3c6757b91
| | | |\
| | | | o  changeset:   8032:b56f236530b5
| | | | |
| | | | | @  changeset:   8031:dad7ab7a3786
| | | | | |

| | | | | o  changeset:   8023:faf256d5c16c

| | | | | o  changeset:   7874:0c7adbb7ddc0
| | | | | |
+---------o  changeset:   7873:f69a0edc8643
| | | | |
o | | | |  changeset:   7872:40a9434212f6
| | | | |  

The files we're concerned with were changed in 8023, 8032, 8037, and 7749 (not
shown below, but in the linear heritage of 7872).  The changes in 8032 and 8031
were identical, and share the same filelog entry.  When attempting to merge 8031
and 8037, debugancestor correctly identifies 7872, but merge chooses 8023 as the
ancestor.  So the diffs for 8031 vs 8023 are empty, even though our files were
changed on this branch, and the diffs for 8037 vs 8023 are completely bogus,
because they're not even on the same branch.
msg7822 (view) Author: Ringding Date: 2008-11-03.13:36:52
(mostly) @tonfa again: The real problem I have with Mercurial's behavior is that
a completely unrelated change 10000 lines away causes a completely different
merge result -- because when the file text does not match 100% anymore, the
ancestor calculation is correct.
msg7424 (view) Author: djc Date: 2008-10-13.14:54:48
It probably shouldn't be any higher than urgent.
msg7423 (view) Author: Ringding Date: 2008-10-13.14:52:12
Does anyone else share the opinion that this issue should have the same priority
as issue1295 (critical)?
msg7340 (view) Author: Ringding Date: 2008-10-08.09:01:10
@tonfa: True, but that doesn't relieve Mercurial of the duty of providing the
correct ancestor to the merge program.

I could also imagine using the changeset date in such a case (later == better).
msg7335 (view) Author: kiilerix Date: 2008-10-07.22:44:08
FWIW, I agree that guessing a merge result is very hard. That is why I keep
insisting that real merges never should be silent. Thus I don't think the
unification of "identical" changes we see here is good; it hides a real "merge".

(With reference to issue1295)
msg7334 (view) Author: tonfa Date: 2008-10-07.21:39:40
By the way you could use the exact same scenario (two same changes in different
branches, one branch reverts it) and expect the result we have now.
It's very hard to guess the intent from that kind of merge.
msg7331 (view) Author: kiilerix Date: 2008-10-07.20:11:07
More fun with the provided test case: continue with

hg clone -r tip . weirder
hg push weirder
hg push -f weirder
cd weirder
hg merge --debug

Notice how the different toposort of the graph causes another ancestor to be
chosen - and even one of the parents.

For reference, I posted some guesswork and suggested a solution at
http://selenic.com/pipermail/mercurial-devel/2008-October/008324.html
msg7324 (view) Author: Ringding Date: 2008-10-07.08:15:48
Background: at my company, we have a Subversion repo which I
frequently import into Mercurial via hgpullsvn. A coworker has created
a rather large patch against the svn repo a few weeks ago which I
tried to merge into the current head using hg. To my great surprise,
the merge changeset was empty which led me to the assumption that said
patch had already been merged into the svn repo (bit by bit, I
thought). I was quite baffled when I found out that this was not the
case at all. After investigating this strange case for a few hours, I
found the problem.


svn x-1
    |
    |
  svn x
    |\
    | +-----------\
    |             |
svn x+1    coworker's patch
    |
   ...
    |
    |
   <A>
    |
    |
   <~A>
    |
    |
   ...
    |
svn current (hundreds of revisions later)

The diagram above shows my situation (top is oldest, bottom is
newest). "svn x" is the revision against which my coworker's patch was
made. At one point I had the patch in my working directory and --
being used to Mercurial where it's easy to take back a commit -- I
accidentally committed the patch to the subversion repository (<A>). I
immediately reverted the commit (<~A>) but history cannot be deleted
so the mistake sits there forever.

Now when I try to merge "coworker's patch" and "svn current" in
Mercurial, apparently revision "<A>" is used as the ancestor instead
of "svn x" (that's what merge --debug says; hg debugancestor correctly
reports "svn x"). The end result is that the entire patch is
completely thrown away without any indication that something went
wrong.

Benoit Boissinot was nice enough to produce a testcase:

hg init
echo a > a
hg ci -Am m
echo b >> a
hg ci -Am m
echo a > a
hg ci -Am m
hg up 0
echo b >> a
hg ci -Am m
hg view
hg merge --debug
cat a

You'll get 'a\n' instead of 'a\nb\n'.

Original message: http://selenic.com/pipermail/mercurial/2008-October/021625.html
History
Date User Action Args
2010-02-15 17:57:44hstuartsetnosy: + hstuart
2010-02-15 17:03:20tonfasetnosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg11728
2010-02-15 16:45:40tonfasetnosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg11727
2009-12-29 12:47:42parrensetnosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg11313
2009-12-28 19:10:58tonfasetnosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg11310
2009-12-28 12:57:47parrensetfiles: + playground3.rextile
nosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg11308
2009-11-13 19:28:30kiilerixsetstatus: testing -> chatting
nosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg10971
2009-11-13 18:49:59mpmsetstatus: chatting -> testing
nosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg10970
2009-11-13 16:42:17parrensetnosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg10967
2009-11-13 16:38:59parrensetnosy: mpm, tonfa, cboos, kupfer, kiilerix, mg, parren, djc, abuehl, jcoomes, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg10966
2009-11-08 18:09:28mpmsetnosy: + cboos, jcoomes
messages: + msg10926
2009-11-08 18:08:25mpmlinkissue1561 superseder
2009-11-08 18:05:45mpmsetpriority: urgent -> bug
status: testing -> chatting
messages: + msg10924
nosy: mpm, tonfa, kupfer, kiilerix, mg, parren, djc, abuehl, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
2009-11-08 17:16:29kiilerixsetstatus: chatting -> testing
nosy: mpm, tonfa, kupfer, kiilerix, mg, parren, djc, abuehl, astratto, dsp, Ringding, sjtai, mjnelson, morisgi, friedrich
messages: + msg10923
2009-09-22 04:53:58parrensetnosy: + parren
2009-07-09 21:30:39morisgisetnosy: + morisgi
2009-07-09 15:56:05mgsetnosy: + mg
2009-06-25 16:31:48friedrichsetnosy: + friedrich
2009-06-24 15:50:46astrattosetnosy: + astratto
2009-02-04 15:51:02sjtaisetnosy: + sjtai
2009-01-06 01:39:16dspsetnosy: mpm, tonfa, kupfer, kiilerix, djc, abuehl, dsp, Ringding, mjnelson
messages: + msg8336
2009-01-06 01:16:51dspsetnosy: + dsp
2008-12-10 18:36:22mjnelsonsetnosy: + mjnelson
messages: + msg8151
2008-11-03 13:36:54Ringdingsetpriority: bug -> urgent
nosy: mpm, tonfa, kupfer, kiilerix, djc, abuehl, Ringding
messages: + msg7822
2008-10-18 18:50:54mpmsetnosy: + mpm
assignedto: mpm
2008-10-13 14:54:48djcsetnosy: tonfa, kupfer, kiilerix, djc, abuehl, Ringding
messages: + msg7424
2008-10-13 14:52:12Ringdingsetnosy: tonfa, kupfer, kiilerix, djc, abuehl, Ringding
messages: + msg7423
2008-10-08 20:34:06kupfersetnosy: + kupfer
2008-10-08 09:01:10Ringdingsetnosy: tonfa, kiilerix, djc, abuehl, Ringding
messages: + msg7340
2008-10-07 22:44:10kiilerixsetnosy: tonfa, kiilerix, djc, abuehl, Ringding
messages: + msg7335
2008-10-07 21:39:41tonfasetnosy: tonfa, kiilerix, djc, abuehl, Ringding
messages: + msg7334
2008-10-07 20:11:08kiilerixsetstatus: unread -> chatting
nosy: + kiilerix
messages: + msg7331
2008-10-07 11:03:20djcsetnosy: + djc
2008-10-07 09:28:40tonfasettopic: + merge
nosy: + tonfa
2008-10-07 08:28:59abuehlsetnosy: + abuehl
2008-10-07 08:16:22Ringdingcreate