Can you help identify ambiguous CPAN distributions?

Hello, Perl community. As I work on converting legacy CPAN Testers (CT1.0) reports to the new CPAN Testers 2.0 (CT2.0) format, I've encountered a curious conundrum and could use some volunteer help.

CT1.0 indexes reports based on the distribution name and version, e.g. "Foo-Bar-1.23". This is an unfortunate historical accident, since PAUSE does not prevent uploads with the same file name to different author directories:

  • JDOE/Foo-Bar-1.23.tar.gz
  • JQPUBLIC/Foo-Bar-1.23.tar.gz

CT2.0 will index reports based on the full, unique distribution file path. For the conversion, I'm working on a heuristic that links each legacy test report (on "Foo-Bar-1.23") to the correct distribution file path for that distribution name and version.

For the most part, it works. Usually, there is only one distribution file path on BackPAN that matches. Sometimes there is more than one possibility, but I've worked out ways to resolve the ambiguity by comparing the possibilities to information in the 01mailrc files or the 02packages.details file.
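
As a rough illustration of the kind of resolution described above (not the actual conversion code), the sketch below accepts a candidate outright when only one BackPAN path matches, and otherwise keeps the single candidate that also appears in an index. The function names and the assumption that all paths are already normalized to the "AUTHOR/file" form are hypothetical; the real heuristic may weigh the 01mailrc and 02packages.details data differently.

```python
from typing import Iterable, Optional


def dist_name_version(path: str) -> str:
    """Return 'Foo-Bar-1.23' for 'JDOE/Foo-Bar-1.23.tar.gz'."""
    # Keep only the file name, even if the author directory has subdirectories.
    filename = path.rsplit("/", 1)[-1]
    for suffix in (".tar.gz", ".tgz", ".zip"):
        if filename.endswith(suffix):
            return filename[: -len(suffix)]
    return filename


def resolve(name_version: str,
            backpan_paths: Iterable[str],
            indexed_paths: Iterable[str]) -> Optional[str]:
    """Pick the distribution path for a legacy report, or None if still ambiguous.

    Assumption: indexed_paths holds paths taken from an index such as
    02packages.details, normalized to the same form as backpan_paths.
    """
    candidates = [p for p in backpan_paths
                  if dist_name_version(p) == name_version]
    if len(candidates) == 1:
        return candidates[0]          # the usual case: only one match
    indexed = set(indexed_paths)
    in_index = [p for p in candidates if p in indexed]
    if len(in_index) == 1:
        return in_index[0]            # the index breaks the tie
    return None                       # no match, or needs a manual mapping
```

Anything that comes back as None would fall through to a hand-maintained mapping file.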

But there are about 50 distribution name-version pairs on BackPAN that my heuristic fails to resolve. Since this is a one-time conversion from CT1.0 to CT2.0, all I need is a mapping file with entries like this for these ambiguous cases:

    YAML-0.39    INGY/YAML-0.39.tar.gz
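
A mapping file in that shape is easy to consume. The sketch below assumes one entry per line, with the name-version pair and the path separated by whitespace; `parse_mapping` is a hypothetical helper name, not part of any CPAN Testers tool.

```python
from typing import Dict


def parse_mapping(text: str) -> Dict[str, str]:
    """Parse lines like 'YAML-0.39    INGY/YAML-0.39.tar.gz' into a dict."""
    mapping: Dict[str, str] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {lineno}: expected 2 fields, got {len(fields)}")
        name_version, path = fields
        mapping[name_version] = path
    return mapping
```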

If you think you can help, either through some automated approach or by volunteering your human brain for some basic research to identify the "authoritative" path (e.g. from the historical author list in the distribution's documentation files), that would be a great help to me, so I can keep plugging away on the conversion code and other to-dos.

Even confirming that the candidates on BackPAN have the same MD5 checksum would be helpful: if the files are identical, then even if we guess the wrong author, the test results are still "good" for the mistaken distribution file.
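
For anyone who wants to run that check on local copies of the candidate files, here is a minimal sketch using Python's standard hashlib module. It assumes the files have already been downloaded from BackPAN; the helper names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def file_digest(path: Path, algorithm: str = "md5") -> str:
    """Return the hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.new(algorithm)
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def all_identical(paths: Iterable[Path], algorithm: str = "md5") -> bool:
    """True if every file has the same digest (False for an empty list)."""
    digests = {file_digest(p, algorithm) for p in paths}
    return len(digests) == 1
```

Passing `algorithm="sha1"` performs the same comparison with SHA1 instead of MD5.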

Here is the list. The name-version pair is followed by an indented list of possible paths for that pair.

Attribute-Memoize-0.01
  DANKOGAI/Attribute-Memoize-0.01.tar.gz
  MARCEL/Attribute-Memoize-0.01.tar.gz
B-Generate-1.12_03
  JCROMIE/B-Generate-1.12_03.tar.gz
  JJORE/B-Generate-1.12_03.tar.gz
Bundle-Cobalt-0.01
  HARASTY/Bundle-Cobalt-0.01.tar.gz
  JPEACOCK/Bundle-Cobalt-0.01.tar.gz
CDDB-0.9
  FONKIE/CDDB-0.9.tar.gz
  KRAEHE/CDDB-0.9.tar.gz
Catalyst-Plugin-Session-Store-File-0.07
  ESSKAR/Catalyst-Plugin-Session-Store-File-0.07.tar.gz
  KARMAN/Catalyst-Plugin-Session-Store-File-0.07.tar.gz
Catalyst-Plugin-Static-0.05
  MRAMBERG/Catalyst-Plugin-Static-0.05.tar.gz
  SRI/Catalyst-Plugin-Static-0.05.tar.gz
Catalyst-Plugin-Static-Simple-0.14
  AGRUNDMA/Catalyst-Plugin-Static-Simple-0.14.tar.gz
  MRAMBERG/Catalyst-Plugin-Static-Simple-0.14.tar.gz
Crypt-SSLeay-0.51
  CHAMAS/Crypt-SSLeay-0.51.tar.gz
  TAKESAKO/Crypt-SSLeay-0.51.tar.gz
Curses-UI-0.72
  MARCUS/Curses-UI-0.72.tar.gz
  MMAKAAY/Curses-UI-0.72.tar.gz
Curses-UI-0.73
  MARCUS/Curses-UI-0.73.tar.gz
  MMAKAAY/Curses-UI-0.73.tar.gz
DateManip-5.20
  PHOENIX/DateManip-5.20.tar.gz
  SBECK/DateManip-5.20.tar.gz
Finance-Bank-HSBC-1.04
  BISSCUITT/Finance-Bank-HSBC-1.04.tar.gz
  MWILSON/Finance-Bank-HSBC-1.04.tar.gz
Finance-Bank-HSBC-1.05
  BISSCUITT/Finance-Bank-HSBC-1.05.tar.gz
  MWILSON/Finance-Bank-HSBC-1.05.tar.gz
Locale-Object-0.73
  EMARTIN/Locale-Object-0.73.tar.gz
  FOTANGO/Locale-Object-0.73.tar.gz
MARC-0.81
  BBIRTH/MARC-0.81.tar.gz
  ESUMMERS/MARC-0.81.tar.gz
MARC-1.13
  ESUMMERS/MARC-1.13.tar.gz
  PETDANCE/MARC-1.13.tar.gz
Mail-Thread-2.41
  RCLAMP/Mail-Thread-2.41.tar.gz
  SIMON/Mail-Thread-2.41.tar.gz
Math-MatrixReal-1.1
  ANDK/Math-MatrixReal-1.1.tar.gz
  STBEY/Math-MatrixReal-1.1.tar.gz
Maypole-Authentication-Abstract-0.6
  BOBTFISH/Maypole-Authentication-Abstract-0.6.tar.gz
  SRI/Maypole-Authentication-Abstract-0.6.tar.gz
Maypole-Config-YAML-0.1
  BOBTFISH/Maypole-Config-YAML-0.1.tar.gz
  SRI/Maypole-Config-YAML-0.1.tar.gz
Maypole-Loader-0.1
  BOBTFISH/Maypole-Loader-0.1.tar.gz
  SRI/Maypole-Loader-0.1.tar.gz
Maypole-Plugin-Authentication-Abstract-0.10
  BOBTFISH/Maypole-Plugin-Authentication-Abstract-0.10.tar.gz
  SRI/Maypole-Plugin-Authentication-Abstract-0.10.tar.gz
Maypole-Plugin-Component-0.05
  BOBTFISH/Maypole-Plugin-Component-0.05.tar.gz
  SRI/Maypole-Plugin-Component-0.05.tar.gz
Maypole-Plugin-Config-YAML-0.04
  BOBTFISH/Maypole-Plugin-Config-YAML-0.04.tar.gz
  SRI/Maypole-Plugin-Config-YAML-0.04.tar.gz
Maypole-Plugin-Exception-0.03
  BOBTFISH/Maypole-Plugin-Exception-0.03.tar.gz
  SRI/Maypole-Plugin-Exception-0.03.tar.gz
Maypole-Plugin-I18N-0.02
  BOBTFISH/Maypole-Plugin-I18N-0.02.tar.gz
  SRI/Maypole-Plugin-I18N-0.02.tar.gz
Maypole-Plugin-Loader-0.03
  BOBTFISH/Maypole-Plugin-Loader-0.03.tar.gz
  SRI/Maypole-Plugin-Loader-0.03.tar.gz
Maypole-Plugin-Relationship-0.03
  BOBTFISH/Maypole-Plugin-Relationship-0.03.tar.gz
  SRI/Maypole-Plugin-Relationship-0.03.tar.gz
Maypole-Plugin-Transaction-0.02
  BOBTFISH/Maypole-Plugin-Transaction-0.02.tar.gz
  SRI/Maypole-Plugin-Transaction-0.02.tar.gz
Maypole-Plugin-Untaint-0.04
  BOBTFISH/Maypole-Plugin-Untaint-0.04.tar.gz
  SRI/Maypole-Plugin-Untaint-0.04.tar.gz
Net-DNS-0.02
  ANDK/Net-DNS-0.02.tar.gz
  MFUHR/Net-DNS-0.02.tar.gz
Net-SSH2-0.07
  AWA/AWA/Net-SSH2-0.07.tar.gz
  DBROBINS/Net-SSH2-0.07.tar.gz
NetPacket-0.04
  ATRAK/NetPacket-0.04.tar.gz
  CGANESAN/NetPacket-0.04.tar.gz
PDL-2.3.2
  CSOE/PDL-2.3.2.tar.gz
  KGB/PDL-2.3.2.tar.gz
PNGgraph-1.11
  DMOW/PNGgraph-1.11.tar.gz
  SBONDS/PNGgraph-1.11.tar.gz
POE-Session-Attributes-0.01
  CFEDDE/POE-Session-Attributes-0.01.tar.gz
  JSN/POE-Session-Attributes-0.01.tar.gz
Plucene-1.19
  SIMON/Plucene-1.19.tar.gz
  STRYTOAST/Plucene-1.19.tar.gz
RT-Extension-MergeUsers-0.02
  JESSE/RT-Extension-MergeUsers-0.02.tar.gz
  KEVINR/RT-Extension-MergeUsers-0.02.tar.gz
SNMP-1.6
  GSM/SNMP-1.6.tar.gz
  WMARQ/SNMP-1.6.tar.gz
SXIP-Membersite-1.0.0
  KGRENNAN/SXIP-Membersite-1.0.0.tar.gz
  TOKUHIROM/SXIP-Membersite-1.0.0.tar.gz
Scalar-Defer-0.13
  AUDREYT/Scalar-Defer-0.13.tar.gz
  NUFFIN/Scalar-Defer-0.13.tar.gz
Term-Prompt-0.02
  ALLENS/Term-Prompt-0.02.tar.gz
  DAZJORZ/Term-Prompt-0.02.tar.gz
Term-Prompt-0.05
  ALLENS/Term-Prompt-0.05.tar.gz
  DAZJORZ/Term-Prompt-0.05.tar.gz
Test-Warn-0.07
  BIGJ/Test-Warn-0.07.tar.gz
  MPRESSLY/Test-Warn-0.07.tar.gz
Time-0.01
  JPRIT/Time-0.01.tar.gz
  PGOLLUCCI/Time-0.01.tar.gz
Tk-Wizard-Bases-1.07
  LGODDARD/Tk-Wizard-Bases-1.07.tar.gz
  MTHURN/Tk-Wizard-Bases-1.07.tar.gz
UUID-0.03
  CFABER/UUID-0.03.tar.gz
  LZAP/UUID-0.03.tar.gz
Win32-EventLog-Carp-1.21
  IKEBE/Win32-EventLog-Carp-1.21.tar.gz
  RRWO/Win32-EventLog-Carp-1.21.tar.gz
YAML-0.39
  INGY/YAML-0.39.tar.gz
  KING/YAML-0.39.tar.gz
finance-yahooquote_0.19
  DJPADZ/finance-yahooquote_0.19.tar.gz
  EDD/finance-yahooquote_0.19.tar.gz
libapreq-1.33
  GEOFF/libapreq-1.33.tar.gz
  STAS/libapreq-1.33.tar.gz
pg95perl5-1.2.0
  MERGL/pg95perl5-1.2.0.tar.gz
  YVESP/pg95perl5-1.2.0.tar.gz
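
For an automated approach, the list above can be read into a simple structure. This is only a sketch: it assumes the exact layout shown (an unindented name-version line followed by indented candidate paths), and `parse_candidates` is a hypothetical name.

```python
from typing import Dict, List, Optional


def parse_candidates(text: str) -> Dict[str, List[str]]:
    """Map each name-version pair to its list of candidate paths."""
    result: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line[0].isspace():
            # An indented line is a candidate path for the current pair.
            if current is None:
                raise ValueError(
                    f"path before any name-version pair: {line!r}")
            result[current].append(line.strip())
        else:
            # An unindented line starts a new name-version pair.
            current = line.strip()
            result[current] = []
    return result
```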

8 Comments

  1. Posted February 8, 2010 at 9:28 pm | Permalink

    I uploaded TOKUHIROM/SXIP-Membersite-1.0.0.tar.gz by mistake.
    Please ignore it :P

    • dagolden
      Posted February 9, 2010 at 8:27 am | Permalink

      Thanks!

  2. Posted February 9, 2010 at 12:07 am | Permalink

    I uploaded TAKESAKO/Crypt-SSLeay-0.51.tar.gz by mistake.
    Please ignore it :-)

  3. ANDK
    Posted February 9, 2010 at 1:19 am | Permalink

    I was neither the maintainer of Math-MatrixReal-1.1.tar.gz nor of Net-DNS-0.02.tar.gz and cannot remember how these came into my directory. Please ignore them.

  4. ANDK
    Posted February 9, 2010 at 1:26 am | Permalink

    The following list identifies those that represent NO conflicts since the two alternatives above have equal SHA1 checksums:

    B-Generate-1.12_03.tar.gz
    Bundle-Cobalt-0.01.tar.gz
    CDDB-0.9.tar.gz
    Curses-UI-0.72.tar.gz
    Curses-UI-0.73.tar.gz
    DateManip-5.20.tar.gz
    MARC-0.81.tar.gz
    MARC-1.13.tar.gz
    Maypole-Authentication-Abstract-0.6.tar.gz
    Maypole-Config-YAML-0.1.tar.gz
    Maypole-Loader-0.1.tar.gz
    Maypole-Plugin-Authentication-Abstract-0.10.tar.gz
    Maypole-Plugin-Component-0.05.tar.gz
    Maypole-Plugin-Config-YAML-0.04.tar.gz
    Maypole-Plugin-Exception-0.03.tar.gz
    Maypole-Plugin-I18N-0.02.tar.gz
    Maypole-Plugin-Loader-0.03.tar.gz
    Maypole-Plugin-Relationship-0.03.tar.gz
    Maypole-Plugin-Transaction-0.02.tar.gz
    Maypole-Plugin-Untaint-0.04.tar.gz
    PDL-2.3.2.tar.gz
    Plucene-1.19.tar.gz
    Tk-Wizard-Bases-1.07.tar.gz
    Win32-EventLog-Carp-1.21.tar.gz
    YAML-0.39.tar.gz
    finance-yahooquote_0.19.tar.gz
    libapreq-1.33.tar.gz
    pg95perl5-1.2.0.tar.gz

    • dagolden
      Posted February 9, 2010 at 8:27 am | Permalink

      Thank you! I'm surprised it isn't more, but that's already a big help.

  5. Offer Kaye
    Posted February 10, 2010 at 8:40 am | Permalink

    Hi David,
    After removing from your list distributions that were already filtered by previous comments by TAKESAKO and ANDK, I looked at the latest version of each distribution on CPAN and looked for the author or contributor name in the documentation. Here are the results:

    Attribute-Memoize-0.01
    DANKOGAI/Attribute-Memoize-0.01.tar.gz
    MARCEL/Attribute-Memoize-0.01.tar.gz

    Both DANKOGAI and MARCEL are listed as authors in the POD. On CPAN, the current version of Attribute::Memoize is listed under DANKOGAI.

    Catalyst-Plugin-Session-Store-File-0.07
    ESSKAR/Catalyst-Plugin-Session-Store-File-0.07.tar.gz
    KARMAN/Catalyst-Plugin-Session-Store-File-0.07.tar.gz

    ESSKAR is listed as the author in the POD.

    Catalyst-Plugin-Static-0.05
    MRAMBERG/Catalyst-Plugin-Static-0.05.tar.gz
    SRI/Catalyst-Plugin-Static-0.05.tar.gz

    SRI is listed as the author in the POD.

    Catalyst-Plugin-Static-Simple-0.14
    AGRUNDMA/Catalyst-Plugin-Static-Simple-0.14.tar.gz
    MRAMBERG/Catalyst-Plugin-Static-Simple-0.14.tar.gz

    AGRUNDMA is listed as the author in the POD. MRAMBERG is listed as a contributor.

    Crypt-SSLeay-0.51
    CHAMAS/Crypt-SSLeay-0.51.tar.gz
    TAKESAKO/Crypt-SSLeay-0.51.tar.gz

    Joshua Chamas is listed in the POD in the ACKNOWLEDGEMENTS section as a past maintainer. This is also written in the SUPPORT section:
    "This module was originally written by Gisle Aas, and was subsequently maintained by Joshua Chamas. It is currently maintained by David Landgren."

    Finance-Bank-HSBC-1.04
    BISSCUITT/Finance-Bank-HSBC-1.04.tar.gz
    MWILSON/Finance-Bank-HSBC-1.04.tar.gz

    MWILSON is listed as the author in the POD.

    Finance-Bank-HSBC-1.05
    BISSCUITT/Finance-Bank-HSBC-1.05.tar.gz
    MWILSON/Finance-Bank-HSBC-1.05.tar.gz

    MWILSON is listed as the author in the POD.

    Locale-Object-0.73
    EMARTIN/Locale-Object-0.73.tar.gz
    FOTANGO/Locale-Object-0.73.tar.gz

    EMARTIN is listed as the author in the POD.

    Mail-Thread-2.41
    RCLAMP/Mail-Thread-2.41.tar.gz
    SIMON/Mail-Thread-2.41.tar.gz

    SIMON is listed as the author in the POD.

    Net-SSH2-0.07
    AWA/AWA/Net-SSH2-0.07.tar.gz
    DBROBINS/Net-SSH2-0.07.tar.gz

    DBROBINS is listed as the author in the POD.

    NetPacket-0.04
    ATRAK/NetPacket-0.04.tar.gz
    CGANESAN/NetPacket-0.04.tar.gz

    ATRAK is listed as an author in the POD.

    PNGgraph-1.11
    DMOW/PNGgraph-1.11.tar.gz
    SBONDS/PNGgraph-1.11.tar.gz

    I couldn't find a "PNGgraph", only a Chart::PNGgraph for which SBONDS is the author.

    POE-Session-Attributes-0.01
    CFEDDE/POE-Session-Attributes-0.01.tar.gz
    JSN/POE-Session-Attributes-0.01.tar.gz

    I cannot find this. Probably CFEDDE as I found something called POE-Session-AttributeBased under his name.

    RT-Extension-MergeUsers-0.02
    JESSE/RT-Extension-MergeUsers-0.02.tar.gz
    KEVINR/RT-Extension-MergeUsers-0.02.tar.gz

    RT-Extension-MergeUsers-0.02 is listed under JESSE on cpan search.

    SNMP-1.6
    GSM/SNMP-1.6.tar.gz
    WMARQ/SNMP-1.6.tar.gz

    The POD lists the file as being copyright GSM.

    Scalar-Defer-0.13
    AUDREYT/Scalar-Defer-0.13.tar.gz
    NUFFIN/Scalar-Defer-0.13.tar.gz

    AUDREYT is listed as the author in the POD.

    Term-Prompt-0.02
    ALLENS/Term-Prompt-0.02.tar.gz
    DAZJORZ/Term-Prompt-0.02.tar.gz
    Term-Prompt-0.05
    ALLENS/Term-Prompt-0.05.tar.gz
    DAZJORZ/Term-Prompt-0.05.tar.gz

    ALLENS is listed as a contributor in the POD.

    Test-Warn-0.07
    BIGJ/Test-Warn-0.07.tar.gz
    MPRESSLY/Test-Warn-0.07.tar.gz

    BIGJ is listed as the author in the POD.

    Time-0.01
    JPRIT/Time-0.01.tar.gz
    PGOLLUCCI/Time-0.01.tar.gz

    JPRIT is the author, but this module no longer exists on CPAN (perhaps his Time::Warp replaced it).

    UUID-0.03
    CFABER/UUID-0.03.tar.gz
    LZAP/UUID-0.03.tar.gz

    CFABER is listed as an author in the POD. However, there seems to be a mix-up on CPAN, as http://search.cpan.org/dist/UUID/ points to version 0.02, while the latest version (0.04) has "** UNAUTHORIZED RELEASE **" in big red letters on the dist page http://search.cpan.org/~jnh/UUID/ .

    Hope this helps,
    Offer

    • dagolden
      Posted February 10, 2010 at 8:43 am | Permalink

      Offer++ thank you very, very much! That's exactly what I needed.

      -- David

© 2009-2014 David Golden All Rights Reserved