Stayman with 44M less than invite

Stayman with 44M less than invite Yes? No? Maybe?

#41 rhm

Group: Advanced Members
Posts: 3,092
Joined: 2005-June-27

Posted 2012-July-02, 06:51

gnasher, on 2012-July-02, 05:44, said:

I've just written some code. I know what it's supposed to do, I'm expecting it to work, and the compiler seems to like it. However, one of the things I'm going to do is to run it, look at some of the results, and compare them to what I'm actually trying to achieve. I think most writers of software would regard that as a normal thing to do.

If I were doing a double-dummy simulation, I would do the equivalent: I'd look at some of the hands, and consider (a) whether they were consistent with the auction I was trying to simulate, and (b) whether the double-dummy results were consistent with real-life expectations. If I were trying to persuade somebody else that my double-dummy simulation accurately modelled real-life bridge, I would invite them to do the same. I don't really understand the rationale for not doing this.

Oh I do inspect a few of the generated deals, mainly to check whether I have overlooked something and have to refine my specifications and because the deals in themselves are often interesting.
I do not use my own code but Dealmaster PRO and I found it to be reliable. The DD analyzer (DeepFinesse) is beyond doubt.
But all this is a completely different issue.
I think I can export generated deals out of Dealmaster PRO, but what do you want me to do with them?
Publish them all in BBO? Seems to me impractical to do. My sample size is usually 1000 deals, unless generating will take ages.
But why would we need to hide anything?
Best way to check results is for somebody else to repeat them with their own software and sometimes with the same or their own specifications.
Whenever I published results and others used similar specifications in their simulations they came to similar results.
But frankly I do not understand all this skepticism of their validity, as if people had any incentive of making false or careless claims.
Simulation results are sometimes surprising and refute "standard wisdom". That's why I like them. Others seem to hate them for that reason.
I wonder who has an open mind here.

Rainer Herrmann

#42 cherdano

5555

Group: Advanced Members
Posts: 9,519
Joined: 2003-September-04
Gender:Male

Posted 2012-July-02, 06:55

rhm, on 2012-July-02, 04:45, said:

I am not aware that this particular issue has been researched specifically.
However millions of deals played have been researched and differentiated according to contract level and whether played in a suit or in notrump.

There were certainly plenty of 4-3 fits in partials when lower-level suit contracts were researched.
Double dummy makes very slightly fewer tricks (about 0.1 tricks at suit contracts below game level).

Rainer Herrmann

If the average of A and the average of B over the whole population are exactly the same, then that does not mean that the average of A and the average of B over a subset of the population is exactly the same.
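To make the point about averages concrete, here is a minimal sketch. The trick counts are invented purely for illustration and are not simulation or play results; the record layout and the names `deals` and `average_gap` are assumptions as well.

```python
# Hypothetical trick counts, invented purely to illustrate the arithmetic.
# Each record: (fit type, double-dummy tricks, single-dummy tricks)
deals = [
    ("4-3", 8, 7),
    ("4-3", 7, 6),
    ("5-3", 8, 9),
    ("5-3", 9, 10),
]


def average_gap(records):
    """Mean of (double-dummy tricks minus single-dummy tricks)."""
    gaps = [dd - sd for _, dd, sd in records]
    return sum(gaps) / len(gaps)


# Over the whole population the two averages agree exactly...
overall_gap = average_gap(deals)

# ...while on each subset they differ by a full trick, in opposite directions.
gap_43 = average_gap([d for d in deals if d[0] == "4-3"])
gap_53 = average_gap([d for d in deals if d[0] == "5-3"])

print(overall_gap, gap_43, gap_53)
```

Whether real 4-3 fits behave anything like this is exactly what is in dispute; the sketch only shows that equal overall averages do not rule it out.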

The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke

#43 gnasher

Andy Bowles

Group: Advanced Members
Posts: 11,993
Joined: 2007-May-03
Gender:Male
Location:London, UK

Posted 2012-July-02, 07:46

rhm, on 2012-July-02, 06:51, said:

I think I can export generated deals out of Dealmaster PRO, but what do you want me to do with them?
Publish them all in BBO? Seems to me impractical to do. My sample size is usually 1000 deals, unless generating will take ages.

No, I'd expect you to produce a few examples. You say that you look at a few of the generated deals. Next time you do a simulation, why not share those deals as well as the conclusions?

Quote

But frankly I do not understand all this skepticism of their validity, as if people had any incentive of making false or careless claims.

You do know that this is the Internet, don't you?

Quote

Simulation results are sometimes surprising and refute "standard wisdom". That's why I like them. Others seem to hate them for that reason.
I wonder who has an open mind here.

I have an open mind, but I tend to be sceptical of assertions that aren't supported by either logic or evidence. I'm not saying that this applies to your simulations, but in this forum generally there is a culture of saying "I did a simulation. I'm not going to tell you how I did it, I'm not going to show you the actual hands, and I'm not going to offer any evidence that I know what I'm doing." That's not qualitatively different from telling us that sugar pills will cure cancer.

... that would still not be conclusive proof, before someone wants to explain that to me as well as if I was a 5 year-old. - gwnn

#44 rhm

Group: Advanced Members
Posts: 3,092
Joined: 2005-June-27

Posted 2012-July-02, 07:49

cherdano, on 2012-July-02, 06:55, said:

True, so what?

Very likely DD simulation is way off on 4-3 fits and also way off on 5-3 fits, but in the opposite direction, and the two miraculously cancel each other out.
I am not a missionary and I am not interested in convincing you to believe the earth is round if you want to believe the earth to be flat.
Continue dreaming if you like.

Rainer Herrmann

#45 cherdano

5555

Group: Advanced Members
Posts: 9,519
Joined: 2003-September-04
Gender:Male

Posted 2012-July-02, 08:14

I am actually quite open to the idea that garbage stayman should be used more often than it is.
But the idea that this is proven with the simulation data you linked to is laughable.

I have played a 4-3 fit or two in my life. And I know that (especially when our combined hands are fairly weak) I would really LOVE to know whether trumps split 3-3 or whether they split worse.
I also know that defending a 4-3 fit isn't easy, but it's much easier when you know it's a 4-3 fit from the beginning.

I would be happy to make a bet that double dummy results for declarer significantly outperform single dummy results in this situation (low-level partial, defense knows from the beginning that it's a 4-3 fit - say the auction was 1N-2C-2D-2H-2S).
I also know that defense against an unrevealing auction (1NT all pass) is harder than on average.

The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke

#46 fromageGB

Group: Advanced Members
Posts: 2,679
Joined: 2008-April-06

Posted 2012-July-02, 08:24

I agree with what Cherdano is saying. I don't place much faith in DD simulations because normally real life is nothing like DD. If there were single dummy simulations, then that is a different matter.

#47 Cthulhu D

Group: Advanced Members
Posts: 1,169
Joined: 2011-November-21
Gender:Not Telling
Location:Australia
Interests:Overbidding

Posted 2012-July-02, 20:10

fromageGB, on 2012-July-02, 08:24, said:

Is it? We always have hand records with deep finesse trick totals printed on them at the end of every club (team and pairs) game, and when we go through the results of a MP game against the deep finesse results most contracts come up the same, and of those that don't, 60-70%+ are the result of clear defensive errors.

The percentage of hands where deep finesse can roll it home taking some absurd line of play is very small.

If contract selection was being done by Deep Finesse that's a different question. It can often see that you want to be in some absurd 6C= in your 5-1 club fit rather than 4S+1 in your 9 card spade fit or whatever, but when you consider the strain selected by the room the results are usually pretty accurate.

#48 awm

Group: Advanced Members
Posts: 8,373
Joined: 2005-February-09
Gender:Male
Location:Zurich, Switzerland

Posted 2012-July-02, 20:50

Cthulhu D, on 2012-July-02, 20:10, said:

This is often true in game and slam level contracts. I've found it to be much less so in partials. The various statistics seem to back that up. 1NT is a very difficult contract to defend, especially when declarer's hand is mostly unknown. Often the wrong opening lead can blow the contract, or the wrong very early switch. 4-3 major fits on fairly balanced hands are tough to play, however; often you have to guess whether to draw trump early (which is usually right if the trumps divide 3-3 and disastrous if they don't).

Adam W. Meyerson
a.k.a. Appeal Without Merit

#49 Siegmund

Alchemist

Group: Advanced Members
Posts: 1,764
Joined: 2004-June-15
Gender:Male
Location:Beside a little lake in northwestern Montana
Interests:Creator of the 'grbbridge' LaTeX typesetting package.

Posted 2012-July-02, 21:29

I don't have any hard numbers on double-vs-single dummy for 4-3 partscore fits, specifically. That is true. Though it's not obvious to me that they should be wildly different, at least in a systematic way (in the case of the NT vs suit comparisons we are talking about here, it's the difference between two double-dummy results that matters, and it is likely that with a 3-3 break it is right to pull trumps in a suit AND right to peel the whole suit in NT, while with a bad break it is right not to pull trumps AND right to retain a stopper in the suit in NT).

I did inspect some number of the dealt hands, before running an automated script to deal out thousands of them. The most striking feature was that the hands where running to a suit made the most difference were the hands where responder was hopelessly weak and one suit was unprotected in notrump -- you don't get rich trying for 110 instead of 90, you get rich by conceding -100 instead of -400. Notrump on less than 20 HCP is often a huge disaster, while a 4-4 fit on less than 20 HCP is usually just fine and a 4-3 fit on less than 20 HCP is...well... not fun but often less painful by a couple tricks than 1NT would have been.

Anyone who is familiar with Deal 3.1 and cares to inspect the script is welcome to send me a PM.

The one way in which I consider this type of analysis most flawed has to do with the quality of the defense. I have done extensive sims of blind leads against 3NT and 4M reached by various auctions, and the less informative auctions consistently receive opening leads that blow a trick more often (to the tune of about 30% of the time vs. 2NT-3NT, 20% of the time vs 2NT-3C-3D-3NT, to 10% of the time vs 2NT-3C-3M-4M.) I have not done the same experiments on the impact of leads against partscores like we are talking about in this thread. If the effect is similar -- that the defense against 1NT is likely to be 0.2 tricks worse than against 1NT-2C-2H-Pass -- that does certainly swing some borderline cases against Stayman, but won't change 2C being right on a 3352 0-count.

#50 mikl_plkcc

Group: Full Members
Posts: 321
Joined: 2008-November-20
Gender:Male
Interests:sailing, bridge

Posted 2012-July-02, 21:44

Currently, I use crawling Stayman to deal with all 4-4 major hands with less than 8 HCPs. My notrump opening, as already mentioned in another post, has the following criteria:

Absolutely no voids or singletons
No two doubletons
No 5-card majors
No good 5-card minors (such as AKJxx)
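The four criteria above can be sketched as a shape check. This is only an illustration: the function name and parameters are hypothetical, and because the post gives AKJxx as an example without defining what makes a 5-card minor "good", that judgement is passed in as a flag rather than computed.

```python
def shape_allows_notrump(spades, hearts, diamonds, clubs,
                         has_good_five_card_minor=False):
    """Return True if the suit lengths satisfy the four listed criteria.

    has_good_five_card_minor is a caller-supplied judgement (assumption):
    the post only offers AKJxx as an example of a "good" minor.
    """
    lengths = [spades, hearts, diamonds, clubs]
    if sum(lengths) != 13:
        raise ValueError("a bridge hand has exactly 13 cards")

    if min(lengths) <= 1:          # absolutely no voids or singletons
        return False
    if lengths.count(2) >= 2:      # no two doubletons
        return False
    if spades >= 5 or hearts >= 5:  # no 5-card majors
        return False
    if has_good_five_card_minor:   # no good 5-card minors
        return False
    return True


print(shape_allows_notrump(4, 4, 3, 2))   # 4432: passes the shape test
print(shape_allows_notrump(5, 3, 3, 2))   # 5-card spade suit: rejected
```

The check covers shape only; the point range for the opening is not stated in the post, so it is left out.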

I currently play non-forcing Stayman, with 2♥ as crawling and 2♠ as invitational with 5 ♠s and 5 or 4 ♥s which can be passed with ♠ fit.

For weak 5-4 or 5-5 hands, I go through crawling Stayman;
for invitational 5-4 or 5-5 hands, I either bid Stayman or transfer to ♥, and then bid 2♠, which is non-forcing;
for game-forcing 5-4 hands, I use Smolen transfer;
for game-forcing 5-5 hands, I transfer to the higher suit, then bid the lower at the 3-level.

#51 y66

Group: Advanced Members
Posts: 6,496
Joined: 2006-February-24

Posted 2012-July-04, 15:13

cherdano, on 2012-July-02, 08:14, said:

Can you or anyone construct a bet about the size of the double dummy bias in the 2 situations you describe such that a panel of 3 objective forum posters could inspect a reasonable number of reasonable samples and judge if you won or not?

If you lose all hope, you can always find it again -- Richard Ford in The Sportswriter

#52 bluecalm

Group: Advanced Members
Posts: 2,555
Joined: 2007-January-22

Posted 2012-July-05, 04:17

Well, if you construct this bet I can search for every hand in vugraph history where 4-3 major fit was played and compare real results to double dummy results

#53 rhm

Group: Advanced Members
Posts: 3,092
Joined: 2005-June-27

Posted 2012-July-05, 05:12

bluecalm, on 2012-July-05, 04:17, said:

Well, if you construct this bet I can search for every hand in vugraph history where 4-3 major fit was played and compare real results to double dummy results

I offer a bet that the average number of tricks taken double dummy in 4-3 fits will be within 0.5 tricks of the average number of tricks taken single dummy over all 4-3 fits you can unearth.
If the number of 4-3 fit deals is really large (more than ten thousand) the difference will be less than 0.3 tricks.
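A rough sketch of how such a bet could be scored, assuming each deal is reduced to a pair (double-dummy tricks, tricks actually taken). The function name, the record layout, and the sample numbers are hypothetical; reading the second sentence as a tighter threshold for samples above ten thousand is also an interpretation.

```python
def score_bet(records):
    """Score the bet on a list of (double_dummy_tricks, actual_tricks) pairs.

    Returns (gap, wins), where gap is the absolute difference between the
    two averages and wins says whether it falls inside the stated margin.
    """
    if not records:
        raise ValueError("need at least one deal")
    n = len(records)
    avg_double_dummy = sum(dd for dd, _ in records) / n
    avg_actual = sum(actual for _, actual in records) / n
    gap = abs(avg_double_dummy - avg_actual)

    # "Within 0.5 tricks" normally; "less than 0.3" for more than 10,000 deals.
    wins = gap < 0.3 if n > 10_000 else gap <= 0.5
    return gap, wins


# Invented sample, for illustration only.
sample = [(8, 7), (7, 7), (9, 9), (8, 8)]
gap, wins = score_bet(sample)
print(gap, wins)
```

Note that this compares overall averages only; it says nothing about how the deals were selected, which is the objection raised later in the thread.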

Rainer Herrmann

#54 gnasher

Andy Bowles

Group: Advanced Members
Posts: 11,993
Joined: 2007-May-03
Gender:Male
Location:London, UK

Posted 2012-July-05, 05:22

bluecalm, on 2012-July-05, 04:17, said:

Well, if you construct this bet I can search for every hand in vugraph history where 4-3 major fit was played and compare real results to double dummy results

That's probably not a fair test, because many of those will be deals where the players chose to play in a 4-3 fit. Those will tend to be deals where the best line is more obvious.

... that would still not be conclusive proof, before someone wants to explain that to me as well as if I was a 5 year-old. - gwnn

#55 bluecalm

Group: Advanced Members
Posts: 2,555
Joined: 2007-January-22

Posted 2012-July-05, 08:18

Quote

That's probably not a fair test

I agree, can you think of a better one?

#56 Zelandakh

Group: Advanced Members
Posts: 10,696
Joined: 2006-May-18
Gender:Not Telling

Posted 2012-July-05, 08:26

bluecalm, on 2012-July-05, 08:18, said:

I agree, can you think of a better one?

Could you search specifically for the auctions:-
1NT - 2♣; 2♦
1NT - 2♣; 2♥
1NT - 2♣; 2♠
1NT - 2♣; 2♦ - 2♥
1NT - 2♣; 2♦ - 2♥; 2♠

and pull up only those that result in 7 card fits? If so, how many hands does this provide? If not enough then perhaps we can also come up with some additional auctions that fill the bill to reach a reasonable sample size.
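The filter described above might look something like this, assuming the hand records can be reduced to a final auction plus the trump lengths of declarer and dummy. The record fields, the auction spelling (suit letters, trailing passes omitted), and the sample records are all hypothetical and do not reflect the format of any real vugraph archive.

```python
# The five auctions listed above, written with suit letters (assumed encoding).
TARGET_AUCTIONS = {
    "1NT-2C-2D",
    "1NT-2C-2H",
    "1NT-2C-2S",
    "1NT-2C-2D-2H",
    "1NT-2C-2D-2H-2S",
}


def seven_card_fits(records):
    """Keep records with a target auction and a combined 7-card trump holding."""
    return [
        r for r in records
        if r["auction"] in TARGET_AUCTIONS
        and r["declarer_trumps"] + r["dummy_trumps"] == 7
    ]


# Invented records, for illustration only.
records = [
    {"auction": "1NT-2C-2H", "declarer_trumps": 4, "dummy_trumps": 3},
    {"auction": "1NT-2C-2H", "declarer_trumps": 4, "dummy_trumps": 4},
    {"auction": "1NT-2C-2D-2H-2S", "declarer_trumps": 3, "dummy_trumps": 4},
    {"auction": "1NT-3NT", "declarer_trumps": 0, "dummy_trumps": 0},
]
selected = seven_card_fits(records)
print(len(selected))
```

A combined length of seven also admits 5-2 and 6-1 holdings, so a stricter test on the individual lengths would be needed to isolate 4-3 fits.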

(-: Zel :-)

#57 gnasher

Andy Bowles

Group: Advanced Members
Posts: 11,993
Joined: 2007-May-03
Gender:Male
Location:London, UK

Posted 2012-July-05, 16:35

Zelandakh, on 2012-July-05, 08:26, said:

Could you search specifically for the auctions:-
...
and pull up only those that result in 7 card fits?

Again, that's not a fair test, because the declaring side chose to follow a route that might lead to a 4-3 fit. Part of Cherdano's proposition was that it be an arbitrary 4-3 fit.

... that would still not be conclusive proof, before someone wants to explain that to me as well as if I was a 5 year-old. - gwnn

#58 gnasher

Andy Bowles

Group: Advanced Members
Posts: 11,993
Joined: 2007-May-03
Gender:Male
Location:London, UK

Posted 2012-July-05, 16:44

bluecalm, on 2012-July-05, 08:18, said:

I agree, can you think of a better one?

I'd be happy with y66's suggestion that "a panel of 3 objective forum posters could inspect a reasonable number of reasonable samples and judge if you won or not".

That reflects my personal view of what we're trying to achieve here. If someone can persuade me that a particular approach will more often lead to what I consider a better contract, I'll pay attention. If the best they can do is to show that a double-dummy solver would have done better, but to me the contracts look worse from a single-dummy perspective, I won't be very interested.

... that would still not be conclusive proof, before someone wants to explain that to me as well as if I was a 5 year-old. - gwnn

#59 rhm

Group: Advanced Members
Posts: 3,092
Joined: 2005-June-27

Posted 2012-July-06, 02:21

gnasher, on 2012-July-05, 16:44, said:

The only type of contract where I know this might happen is the grand slam, and even there the difference in practice is rather small. (Single dummy, you should be more conservative when bidding a grand.)

Apart from that show me a large number of random or pseudo random deals, filtered and selected by any Bridge criteria you like, where a double-dummy perspective will consistently favor a certain type of contract to be better than another one, where single-dummy you would come to the opposite conclusion.

I would be really interested, but I am pretty certain you are talking in mathematical terms about an empty set, of which you can claim anything you like without being proven wrong.
If you have a large sample the few double dummy oddities are all but irrelevant to the result.

Rainer Herrmann

#60 gnasher

Andy Bowles

Group: Advanced Members
Posts: 11,993
Joined: 2007-May-03
Gender:Male
Location:London, UK

Posted 2012-July-06, 03:21

rhm, on 2012-July-06, 02:21, said:

Why should I show you anything? You've asserted that, for the particular category of deals we're discussing, double-dummy analysis accurately models single-dummy play. So far as I can see, you have provided neither evidence nor argument in support of this assertion. And now you want me to do the work of testing it?

... that would still not be conclusive proof, before someone wants to explain that to me as well as if I was a 5 year-old. - gwnn
